INHERITED AND SOMATIC MUTATIONS OF
APC GENE IN COLORECTAL CANCER OF HUMANS 

The U.S. Government has a paid-up license in this invention and
the right in limited circumstances to require the patent owner to
license others on reasonable terms as provided for by the terms of
grants awarded by the National Institutes of Health.

TECHNICAL AREA OF THE INVENTION

The invention relates to the area of cancer diagnostics and ther-
apeutics. More particularly, the invention relates to detection of the
germline and somatic alterations of wild-type APC genes. In addition, 
it relates to therapeutic intervention to restore the function of APC 
gene product. 

BACKGROUND OF THE INVENTION 

According to the model of Knudson for tumorigenesis (Cancer 
Research, Vol. 45, p. 1482, 1985), there are tumor suppressor genes in 
all normal cells which, when they become non-functional due to muta- 
tion, cause neoplastic development. Evidence for this model has been 
found in the cases of retinoblastoma and colorectal tumors. The impli-
cated suppressor genes in those tumors, RB, p53, DCC and MCC, were
found to be deleted or altered in many cases of the tumors studied.
(Hansen and Cavenee, Cancer Research, Vol. 47, pp. 5518-5527 (1987);
Baker et al., Science, Vol. 244, p. 217 (1989); Fearon et al., Science,
Vol. 247, p. 49 (1990); Kinzler et al., Science, Vol. 251, p. 1366 (1991).)

In order to fully understand the pathogenesis of tumors, it will 
be necessary to identify the other suppressor genes that play a role in 
the tumorigenesis process. Prominent among these is the one(s) pre-
sumptively located at 5q21. Cytogenetic (Herrera et al., Am. J. Med.
Genet., Vol. 25, p. 473 (1986)) and linkage (Leppert et al., Science, Vol.
238, p. 1411 (1987); Bodmer et al., Nature, Vol. 328, p. 614 (1987)) stud-
ies have shown that this chromosome region harbors the gene
responsible for familial adenomatous polyposis (FAP) and Gardner's
Syndrome (GS). FAP is an autosomal-dominant, inherited disease in
which affected individuals develop hundreds to thousands of
adenomatous polyps, some of which progress to malignancy. GS is a
variant of FAP in which desmoid tumors, osteomas and other soft tissue
tumors occur together with multiple adenomas of the colon and rec-
tum. A less severe form of polyposis has been identified in which only
a few (2-40) polyps develop. This condition also is familial and is linked
to the same chromosomal markers as FAP and GS (Leppert et al., New
England Journal of Medicine, Vol. 322, pp. 904-908, 1990). Additionally,
this chromosomal region is often deleted from the adenomas
(Vogelstein et al., N. Engl. J. Med., Vol. 319, p. 525 (1988)) and carcino-
mas (Vogelstein et al., N. Engl. J. Med., Vol. 319, p. 525 (1988); Solomon
et al., Nature, Vol. 328, p. 616 (1987); Sasaki et al., Cancer Research,
Vol. 49, p. 4402 (1989); Delattre et al., Lancet, Vol. 2, p. 353 (1989); and
Ashton-Rickardt et al., Oncogene, Vol. 4, p. 1169 (1989)) of patients
without FAP (sporadic tumors). Thus, a putative suppressor gene on
chromosome 5q21 appears to play a role in the early stages of
colorectal neoplasia in both sporadic and familial tumors.

Although the MCC gene has been identified on 5q21 as a candi-
date suppressor gene, it does not appear to be altered in FAP or GS
patients. Thus there is a need in the art for investigations of this chro-
mosomal region to identify genes and to determine if any of such genes
are associated with FAP and/or GS and the process of tumorigenesis.

SUMMARY OF THE INVENTION

It is an object of the present invention to provide a method for 
diagnosing and prognosing a neoplastic tissue of a human. 

It is another object of the invention to provide a method of 
detecting genetic predisposition to cancer. 

It is another object of the invention to provide a method of sup-
plying wild-type APC gene function to a cell which has lost said gene
function.

It is yet another object of the invention to provide a kit for 
determination of the nucleotide sequence of APC alleles by the 
polymerase chain reaction.

It is still another object of the invention to provide nucleic acid
probes for detection of mutations in the human APC gene.

It is still another object of the invention to provide a cDNA mol- 
ecule encoding the APC gene product. 

It is yet another object of the invention to provide a preparation 
of the human APC protein. 

It is another object of the invention to provide a method of
screening for genetic predisposition to cancer. 

It is an object of the invention to provide methods of testing 
therapeutic agents for the ability to suppress neoplasia.

It is still another object of the invention to provide animals car- 
rying mutant APC alleles. 

These and other objects of the invention are provided by one or 
more of the embodiments which are described below. In one embodi- 
ment of the present invention a method of diagnosing or prognosing a 
neoplastic tissue of a human is provided comprising: detecting somatic
alteration of wild-type APC genes or their expression products in a 
sporadic colorectal cancer tissue, said alteration indicating neoplasia of 
the tissue. 

In yet another embodiment a method is provided of detecting 
genetic predisposition to cancer in a human including familial 
adenomatous polyposis (FAP) and Gardner's Syndrome (GS), comprising: 
isolating a human sample selected from the group consisting of blood 
and fetal tissue; detecting alteration of wild-type APC gene coding 
sequences or their expression products from the sample, said alteration 
indicating genetic predisposition to cancer. 

In another embodiment of the present invention a method is 
provided for supplying wild-type APC gene function to a cell which has 
lost said gene function by virtue of a mutation in the APC gene, com-
prising: introducing a wild-type APC gene into a cell which has lost
said gene function such that said wild-type gene is expressed in the
cell.

In another embodiment a method of supplying wild-type APC 
gene function to a cell is provided comprising: introducing a portion of 
a wild-type APC gene into a cell which has lost said gene function such
that said portion is expressed in the cell, said portion encoding a part
of the APC protein which is required for non-neoplastic growth of said
cell. APC protein can also be applied to cells or administered to ani-
mals to remediate for mutant APC genes. Synthetic peptides or drugs
can also be used to mimic APC function in cells which have altered
APC expression.

In yet another embodiment a pair of single stranded primers is
provided for determination of the nucleotide sequence of the APC gene
by polymerase chain reaction. The sequence of said pair of single
stranded DNA primers is derived from chromosome 5q band 21, said
pair of primers allowing synthesis of APC gene coding sequences.

In still another embodiment of the invention a nucleic acid probe 
is provided which is complementary to human wild-type APC gene cod- 
ing sequences and which can form mismatches with mutant APC genes, 
thereby allowing their detection by enzymatic or chemical cleavage or 
by shifts in electrophoretic mobility. 

In another embodiment of the invention a method is provided for 
detecting the presence of a neoplastic tissue in a human. The method 
comprises isolating a body sample from a human; detecting in said sam-
ple alteration of a wild-type APC gene sequence or wild-type APC
expression product, said alteration indicating the presence of a 
neoplastic tissue in the human. 

In still another embodiment a cDNA molecule is provided which 
comprises the coding sequence of the APC gene. 

In even another embodiment a preparation of the human APC 
protein is provided which is substantially free of other human proteins. 
The amino acid sequence of the protein is shown in Figure 3 or 7. 

In yet another embodiment of the invention a method is provided 
for screening for genetic predisposition to cancer, including familial 
adenomatous polyposis (FAP) and Gardner's Syndrome (GS), in a human. 
The method comprises: detecting among kindred persons the presence 
of a DNA polymorphism which is linked to a mutant APC allele in an 
individual having a genetic predisposition to cancer, said kindred being 
genetically related to the individual, the presence of said polymorphism 
suggesting a predisposition to cancer.

In another embodiment of the invention a method of testing
therapeutic agents for the ability to suppress a neoplastically trans-
formed phenotype is provided. The method comprises: applying a test
substance to a cultured epithelial cell which carries a mutation in an 
APC allele; and determining whether said test substance suppresses 
the neoplastically transformed phenotype of the cell. 

In another embodiment of the invention a method of testing 
therapeutic agents for the ability to suppress a neoplastically trans- 
formed phenotype is provided. The method comprises: administering a 
test substance to an animal which carries a mutant APC allele; and
determining whether said test substance prevents or suppresses the 
growth of tumors. 

In still other embodiments of the invention transgenic animals 
are provided. The animals carry a mutant APC allele from a second 
animal species or have been genetically engineered to contain an inser- 
tion mutation which disrupts an APC allele. 

The present invention provides the art with the information that 
the APC gene, a heretofore unknown gene, is in fact a target of muta-
tional alterations on chromosome 5q21 and that these alterations are
associated with the process of tumorigenesis. This information allows
highly specific assays to be performed to assess the neoplastic status of
a particular tissue or the predisposition to cancer of an individual. This
invention has applicability to Familial Adenomatous Polyposis, sporadic
colorectal cancers, Gardner's Syndrome, as well as the less severe
familial polyposis discussed above.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1A shows an overview of yeast artificial chromosome 
(YAC) contigs. Genetic distances between selected RFLP markers 
from within the contigs are shown in centiMorgans. 

Figure 1B shows a detailed map of the three central contigs.
The position of the six identified genes from within the FAP region is
shown; the 5' and 3' ends of the transcripts from these genes have in
general not yet been isolated, as indicated by the string of dots sur-
rounding the bars denoting the genes' positions. Selected restriction
endonuclease recognition sites are indicated: B, BssH2; S, SstII;
M, MluI; N, NruI.

Figure 2 shows the sequence of TB1 and TB2 genes. The cDNA
sequence of the TB1 gene was determined from the analysis of 11
cDNA clones derived from normal colon and liver, as described in the
text. A total of 2314 bp were contained within the overlapping cDNA
clones, defining an ORF of 424 amino acids beginning at nucleotide 1.
Only the predicted amino acids from the ORF are shown. The
carboxy-terminal end of the ORF has apparently been identified, but
the 5' end of the TB1 transcript has not yet been precisely determined.

The cDNA sequence of the TB2 gene was determined from the 
YS-39 clone derived as described in the text. This clone consisted of 
2300 bp and defined an ORF of 185 amino acids beginning at nucleotide 
1. Only the predicted amino acids are shown. The carboxy terminal 
end of the ORF has apparently been identified, but the 5' end of the 
TB2 transcript has not been precisely determined. 

Figure 3 shows the sequence of the APC gene product. The 
cDNA sequence was determined through the analysis of 87 cDNA clones 
derived from normal colon, liver, and brain. A total of 8973 bp were 
contained within overlapping cDNA clones, defining an ORF of 2842
amino acids. In-frame stop codons surrounded this ORF, as described in
the text, suggesting that the entire APC gene product was represented 
in the ORF illustrated. Only the predicted amino acids are shown. 

Figure 4 shows the local similarity between human APC and ral2 
of yeast. Local similarity among the APC and MCC genes and the m3 
muscarinic acetylcholine receptor is shown. The region of the mAChR 
shown corresponds to that responsible for coupling the receptor to G 
proteins. The connecting lines indicate identities; dots indicate related
amino acid residues.

Figure 5 shows the genomic map of the 1200 kb NotI fragment at
the FAP locus. The NotI fragment is shown as a bold line. Relevant
parts of the deletion chromosomes from patients 3214 and 3824 are
shown as stippled lines. Probes used to characterize the NotI fragment
and the deletions, and three YACs from which subclones were obtained,
are shown below the restriction map. The chimeric end of YAC
183H12 is indicated by a dotted line. The orientation and approximate
position of MCC are indicated above the map.

Figure 6 shows the DNA sequence and predicted amino acid 
sequence of DPI (TB2). The nucleotide numbering begins at the most 5' 
nucleotide isolated. A proposed initiation methionine (base 77) is indi- 
cated in bold type. The entire coding sequence is presented. 

Figure 7 shows the cDNA and predicted amino acid sequence of
DP2.5 (APC). The nucleotide numbering begins at the proposed initia-
tion methionine. The nucleotides and amino acids of the alternatively
spliced exon (exon 9; nucleotide positions 934-1236) are presented in
lower case letters. At the 3' end, a poly(A) addition signal occurs at
9530, and one cDNA clone has a poly(A) at 9563. Other cDNA clones 
extend beyond 9563, however, and their consensus sequence is included 
here. 

Figure 8 shows the arrangement of exons in DP2.5 (APC). 
(A) Exon 9 corresponds to nucleotides 933-1312; exon 9a corresponds to 
nucleotides 1236-1312. The stop codon in the cDNA is at nucleotide 
8535. (B) Partial intronic sequence surrounding each exon is shown.

DETAILED DESCRIPTION

It is a discovery of the present invention that mutational events 
associated with tumorigenesis occur in a previously unknown gene on 
chromosome 5q named here the APC (Adenomatous Polyposis Coli)
gene. Although it was previously known that deletions of alleles on
chromosome 5q were common in certain types of cancers, it was not
known that a target gene of these deletions was the APC gene. Fur- 
ther it was not known that other types of mutational events in the APC 
gene are also associated with cancers. The mutations of the APC gene 
can involve gross rearrangements, such as insertions and deletions. 
Point mutations have also been observed. 

According to the diagnostic and prognostic method of the 
present invention, alteration of the wild-type APC gene is detected. 
"Alteration of a wild-type gene" according to the present invention 
encompasses all forms of mutations, including deletions. The alter-
ation may be due to either rearrangements such as insertions, inver-
sions, and deletions, or to point mutations. Deletions may be of the
entire gene or only a portion of the gene. Somatic mutations are those
which occur only in certain tissues, e.g., in the tumor tissue, and are
not inherited in the germline. Germline mutations can be found in any
of a body's tissues. If only a single allele is somatically mutated, an
early neoplastic state is indicated. However, if both alleles are
mutated then a late neoplastic state is indicated. The finding of APC
mutations thus provides both diagnostic and prognostic information.
An APC allele which is not deleted (e.g., that on the sister chromosome 
to a chromosome carrying an APC deletion) can be screened for other 
mutations, such as insertions, small deletions, and point mutations. It 
is believed that many mutations found in tumor tissues will be those
leading to decreased expression of the APC gene product. However,
mutations leading to non-functional gene products would also lead to a
cancerous state. Point mutational events may occur in regulatory
regions, such as in the promoter of the gene, leading to loss or diminu-
tion of expression of the mRNA. Point mutations may also abolish
proper RNA processing, leading to loss of expression of the APC gene
product.

In order to detect the alteration of the wild-type APC gene in a 
tissue, it is helpful to isolate the tissue free from surrounding normal 
tissues. Means for enriching a tissue preparation for tumor cells are 
known in the art. For example, the tissue may be isolated from paraf-
fin or cryostat sections. Cancer cells may also be separated from nor-
mal cells by flow cytometry. These as well as other techniques for
separating tumor from normal cells are well known in the art. If the 
tumor tissue is highly contaminated with normal cells, detection of 
mutations is more difficult. 

Detection of point mutations may be accomplished by molecular 
cloning of the APC allele (or alleles) and sequencing that allele(s) using 
techniques well known in the art. Alternatively, the polymerase chain
reaction (PCR) can be used to amplify gene sequences directly from a
genomic DNA preparation from the tumor tissue. The DNA sequence
of the amplified sequences can then be determined. The polymerase
chain reaction itself is well known in the art. See, e.g., Saiki et al.,
Science, Vol. 239, p. 487, 1988; U.S. 4,633,203; and U.S. 4,663,195.
Specific primers which can be used in order to amplify the gene will
be discussed in more detail below. The ligase chain reaction, which is
known in the art, can also be used to amplify APC sequences. See Wu
et al., Genomics, Vol. 4, pp. 560-569 (1989). In addition, a technique
known as allele specific PCR can be used. (See Ruano and Kidd,
Nucleic Acids Research, Vol. 17, p. 8392, 1989.) According to this
technique, primers are used which hybridize at their 3' ends to a par-
ticular APC mutation. If the particular APC mutation is not present,
an amplification product is not observed. Amplification Refractory
Mutation System (ARMS) can also be used as disclosed in European
Patent Application Publication No. 0332435 and in Newton et al.,
Nucleic Acids Research, Vol. 17, p. 7, 1989. Insertions and deletions of
genes can also be detected by cloning, sequencing and amplification. In 
addition, restriction fragment length polymorphism (RFLP) probes for 
the gene or surrounding marker genes can be used to score alteration 
of an allele or an insertion in a polymorphic fragment. Such a method 
is particularly useful for screening among kindred persons of an 
affected individual for the presence of the APC mutation found in that 
individual. Single stranded conformation polymorphism (SSCP) analysis
can also be used to detect base change variants of an allele. (Orita et
al., Proc. Natl. Acad. Sci. USA, Vol. 86, pp. 2766-2770, 1989, and
Genomics, Vol. 5, pp. 874-879, 1989.) Other techniques for detecting
insertions and deletions as are known in the art can be used.
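
By way of illustration only, the allele specific PCR logic described above can be modeled with a short Python sketch. This is an in silico toy model and not a laboratory protocol. The sequences are invented and are not taken from the APC gene, the names `revcomp` and `expects_product` are hypothetical, and the rule that a primer is extended only when it matches the template exactly, including its 3' terminal base, is a simplifying assumption.

```python
def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def expects_product(sense_strand, forward_primer, reverse_primer):
    """Toy rule: a product is expected only if both primers match exactly.

    The forward primer must occur in the sense strand, and the reverse
    primer must be the reverse complement of a site further downstream.
    """
    fwd_site = sense_strand.find(forward_primer)
    rev_site = sense_strand.find(revcomp(reverse_primer))
    return fwd_site != -1 and rev_site != -1 and fwd_site < rev_site


# Invented sequences (assumption): the mutant differs at one base (index 15).
WILD_TYPE = "GATTACAGGCCTTAACGTGCATCCGGATATTCGCAAGT"
MUTANT = WILD_TYPE[:15] + "T" + WILD_TYPE[16:]

# The allele specific primer ends, at its 3' end, on the mutated base.
MUTANT_PRIMER = MUTANT[4:16]
REVERSE_PRIMER = revcomp(WILD_TYPE[-12:])

print(expects_product(MUTANT, MUTANT_PRIMER, REVERSE_PRIMER))
print(expects_product(WILD_TYPE, MUTANT_PRIMER, REVERSE_PRIMER))
```

In this model a product is predicted for the mutant template and none for the wild-type template, which mirrors the statement that no amplification product is observed when the particular mutation is absent.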

Alteration of wild-type genes can also be detected on the basis 
of the alteration of a wild-type expression product of the gene. Such 
expression products include both the APC mRNA as well as the APC 
protein product. The sequences of these products are shown in 
Figures 3 and 7. Point mutations may be detected by amplifying and 
sequencing the mRNA or via molecular cloning of cDNA made from the 
mRNA. The sequence of the cloned cDNA can be determined using 
DNA sequencing techniques which are well known in the art. The 
cDNA can also be sequenced via the polymerase chain reaction (PCR) 
which will be discussed in more detail below. 

Mismatches, according to the present invention, are hybridized
nucleic acid duplexes which are not 100% homologous. The lack of
total homology may be due to deletions, insertions, inversions, substitu-
tions or frameshift mutations. Mismatch detection can be used to
detect point mutations in the gene or its mRNA product. While these
techniques are less sensitive than sequencing, they are simpler to per-
form on a large number of tumor samples. An example of a mismatch
cleavage technique is the RNase protection method, which is described
in detail in Winter et al., Proc. Natl. Acad. Sci. USA, Vol. 82, p. 7575,
1985 and Myers et al., Science, Vol. 230, p. 1242, 1985. In the practice
of the present invention the method involves the use of a labeled 
riboprobe which is complementary to the human wild-type APC gene 
coding sequence. The riboprobe and either mRNA or DNA isolated 
from the tumor tissue are annealed (hybridized) together and subse- 
quently digested with the enzyme RNase A which is able to detect 
some mismatches in a duplex RNA structure. If a mismatch is detected 
by RNase A, it cleaves at the site of the mismatch. Thus, when the 
annealed RNA preparation is separated on an electrophoretic gel 
matrix, if a mismatch has been detected and cleaved by RNase A, an 
RNA product will be seen which is smaller than the full-length duplex 
RNA for the riboprobe and the mRNA or DNA. The riboprobe need not 
be the full length of the APC mRNA or gene but can be a segment of 
either. If the riboprobe comprises only a segment of the APC mRNA or 
gene it will be desirable to use a number of these probes to screen the 
whole mRNA sequence for mismatches. 
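
As a rough illustration of how the resulting gel pattern might be read, the following Python sketch predicts the lengths of the protected fragments. It rests on assumptions introduced here: the wild-type and sample sequences are supplied as aligned strings of equal length in the same sense, the riboprobe is perfectly complementary to the wild-type sequence, every mismatch is cleaved, and the mismatched base itself is lost. In practice RNase A detects only some mismatches, so the sketch overstates sensitivity. The function name `protected_fragments` is hypothetical.

```python
def protected_fragments(wild_type, sample):
    """Return lengths of probe fragments left after cleavage at mismatches."""
    if len(wild_type) != len(sample):
        raise ValueError("sequences must be aligned and of equal length")
    fragments = []
    run = 0  # length of the current perfectly matched stretch
    for wt_base, sample_base in zip(wild_type.upper(), sample.upper()):
        if wt_base == sample_base:
            run += 1
        else:
            # Cleavage at the mismatch ends the current protected stretch.
            if run:
                fragments.append(run)
            run = 0
    if run:
        fragments.append(run)
    return fragments


# Invented 10-base example (assumption): one substitution at index 4.
print(protected_fragments("ACGUACGUAC", "ACGUACGUAC"))
print(protected_fragments("ACGUACGUAC", "ACGUUCGUAC"))
```

A single full-length fragment corresponds to no detected mismatch, while two or more shorter fragments correspond to the smaller RNA product described above.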

In similar fashion, DNA probes can be used to detect mis-
matches, through enzymatic or chemical cleavage. See, e.g., Cotton et
al., Proc. Natl. Acad. Sci. USA, Vol. 85, p. 4397, 1988; and Shenk et al.,
Proc. Natl. Acad. Sci. USA, Vol. 72, p. 989, 1975. Alternatively, mis-
matches can be detected by shifts in the electrophoretic mobility of
mismatched duplexes relative to matched duplexes. See, e.g., Cariello,
Human Genetics, Vol. 42, p. 726, 1988. With either riboprobes or DNA
probes, the cellular mRNA or DNA which might contain a mutation can
be amplified using PCR (see below) before hybridization. Changes in
DNA of the APC gene can also be detected using Southern hybridiza-
tion, especially if the changes are gross rearrangements, such as dele-
tions and insertions.

DNA sequences of the APC gene which have been amplified by
use of polymerase chain reaction may also be screened using allele-spe-
cific probes. These probes are nucleic acid oligomers, each of which
contains a region of the APC gene sequence harboring a known muta-
tion. For example, one oligomer may be about 30 nucleotides in length,
corresponding to a portion of the APC gene sequence. By use of a bat-
tery of such allele-specific probes, PCR amplification products can be
screened to identify the presence of a previously identified mutation in
the APC gene. Hybridization of allele-specific probes with amplified
APC sequences can be performed, for example, on a nylon filter.
Hybridization to a particular probe under stringent hybridization condi-
tions indicates the presence of the same mutation in the tumor tissue
as in the allele-specific probe.
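
Screening with a battery of allele-specific probes can be sketched, purely for illustration, as follows. The sketch assumes that stringent hybridization behaves like an exact match between the probe and the amplified product, and it writes each probe in the same sense as the product for readability. The probes are 15 bases long rather than about 30 so that the example stays short. All sequences, the probe names `mutation_A` and `mutation_B`, and the function `screen_with_probes` are hypothetical.

```python
def screen_with_probes(amplified, probes):
    """Return names of probes that 'hybridize' under the exact-match rule."""
    product = amplified.upper()
    return sorted(name for name, seq in probes.items()
                  if seq.upper() in product)


# Invented wild-type segment (assumption), not taken from the APC gene.
WILD_TYPE = "GATTACAGGCCTTAACGTGCATCCGGATATTCGCAAGT"

# Each probe spans one previously identified point mutation.
PROBES = {
    "mutation_A": WILD_TYPE[8:15] + "T" + WILD_TYPE[16:23],   # C to T at index 15
    "mutation_B": WILD_TYPE[20:27] + "G" + WILD_TYPE[28:35],  # T to G at index 27
}

# A tumor-derived product carrying only the first mutation.
TUMOR = WILD_TYPE[:15] + "T" + WILD_TYPE[16:]

print(screen_with_probes(TUMOR, PROBES))
print(screen_with_probes(WILD_TYPE, PROBES))
```

Only the probe carrying the same mutation as the tumor product is reported, and the wild-type product matches none of the mutant probes.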

Alteration of APC mRNA expression can be detected by any
technique known in the art. These include Northern blot analysis, PCR
amplification and RNase protection. Diminished mRNA expression
indicates an alteration of the wild-type APC gene.

Alteration of wild-type APC genes can also be detected by 
screening for alteration of wild-type APC protein. For example, 
monoclonal antibodies immunoreactive with APC can be used to screen 
a tissue. Lack of cognate antigen would indicate an APC mutation.
Antibodies specific for products of mutant alleles could also be used to
detect mutant APC gene product. Such immunological assays can be
done in any convenient format known in the art. These include West-
ern blots, immunohistochemical assays and ELISA assays. Any means
for detecting an altered APC protein can be used to detect alteration
of wild-type APC genes. Functional assays can be used, such as protein
binding determinations. For example, it is believed that APC protein
oligomerizes to itself and/or MCC protein or binds to a G protein.
Thus, an assay for the ability to bind to wild type APC or MCC protein
or that G protein can be employed. In addition, assays can be used
which detect APC biochemical function. It is believed that APC is
involved in phospholipid metabolism. Thus, assaying the enzymatic
products of the involved phospholipid metabolic pathway can be used to
determine APC activity. Finding a mutant APC gene product indicates
alteration of a wild-type APC gene. 

Mutant APC genes or gene products can also be detected in 
other human body samples, such as serum, stool, urine and sputum.
The same techniques discussed above for detection of mutant APC
genes or gene products in tissues can be applied to other body samples.
Cancer cells are sloughed off from tumors and appear in such body
samples. In addition, the APC gene product itself may be secreted into
the extracellular space and found in these body samples even in the
absence of cancer cells. By screening such body samples, a simple 
early diagnosis can be achieved for many types of cancers. In addition, 
the progress of chemotherapy or radiotherapy can be monitored more 
easily by testing such body samples for mutant APC genes or gene 
products. 

The methods of diagnosis of the present invention are applicable 
to any tumor in which APC has a role in tumorigenesis. Deletions of
chromosome arm 5q have been observed in tumors of lung, breast,
colon, rectum, bladder, liver, sarcomas, stomach and prostate, as well
as in leukemias and lymphomas. Thus these are likely to be tumors in
which APC has a role. The diagnostic method of the present invention
is useful for clinicians so that they can decide upon an appropriate
course of treatment. For example, a tumor displaying alteration of
both APC alleles might suggest a more aggressive therapeutic regimen 
than a tumor displaying alteration of only one APC allele. 

The primer pairs of the present invention are useful for determi- 
nation of the nucleotide sequence of a particular APC allele using the 
polymerase chain reaction. The pairs of single stranded DNA primers
can be annealed to sequences within or surrounding the APC gene on
chromosome 5q in order to prime amplifying DNA synthesis of the APC
gene itself. A complete set of these primers allows synthesis of all of
the nucleotides of the APC gene coding sequences, i.e., the exons. The
set of primers preferably allows synthesis of both intron and exon
sequences. Allele specific primers can also be used. Such primers
anneal only to particular APC mutant alleles, and thus will only amplify
a product in the presence of the mutant allele as a template.

In order to facilitate subsequent cloning of amplified sequences,
primers may have restriction enzyme site sequences appended to their
5' ends. Thus, all nucleotides of the primers are derived from APC
sequences or sequences adjacent to APC except the few nucleotides
necessary to form a restriction enzyme site. Such enzymes and sites
are well known in the art. The primers themselves can be synthesized
using techniques which are well known in the art. Generally, the prim-
ers can be made using oligonucleotide synthesizing machines which are
commercially available. Given the sequence of the APC open reading
frame shown in Figure 7, design of particular primers is well within the 
skill of the art. 
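
A minimal sketch of how such a primer pair might be assembled is given below, for illustration only. The template sequence is invented, the function name `design_primer_pair` and its parameters are hypothetical, and the choice of the EcoRI (GAATTC) and BamHI (GGATCC) recognition sequences as the appended sites is simply one possible example. The sketch does not check melting temperature, secondary structure, or whether either site also occurs inside the amplified region.

```python
def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def design_primer_pair(template, start, end, length=18,
                       fwd_site="GAATTC", rev_site="GGATCC"):
    """Build primers flanking template[start:end], written 5' to 3'.

    Each primer is the gene-derived sequence with a restriction enzyme
    recognition sequence appended to its 5' end.
    """
    if end - start < 2 * length:
        raise ValueError("target region is too short for two primers")
    forward = fwd_site + template[start:start + length]
    reverse = rev_site + revcomp(template[end - length:end])
    return forward, reverse


# Invented template (assumption), not taken from the APC gene.
TEMPLATE = "GATTACAGGCCTTAACGTGCATCCGGATATTCGCAAGTCCTAGGTTAACC"
FORWARD, REVERSE = design_primer_pair(TEMPLATE, 0, len(TEMPLATE), length=10)
print(FORWARD)
print(REVERSE)
```

Apart from the six bases of each restriction site, every nucleotide of both primers is derived from the template, as the passage above requires.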

The nucleic acid probes provided by the present invention are 
useful for a number of purposes. They can be used in Southern hybrid-
ization to genomic DNA and in the RNase protection method for
detecting point mutations already discussed above. The probes can be
used to detect PCR amplification products. They may also be used to
detect mismatches with the APC gene or mRNA using other tech-
niques. Mismatches can be detected using either enzymes (e.g., S1
nuclease), chemicals (e.g., hydroxylamine or osmium tetroxide and
piperidine), or changes in electrophoretic mobility of mismatched
hybrids as compared to totally matched hybrids. These techniques are
known in the art. See, Cotton, supra, Shenk, supra, Myers, supra, Win-
ter, supra, and Novack et al., Proc. Natl. Acad. Sci. USA, Vol. 83, p.
386, 1986. Generally, the probes are complementary to APC gene cod-
ing sequences, although probes to certain introns are also contem-
plated. An entire battery of nucleic acid probes is used to compose a
kit for detecting alteration of wild-type APC genes. The kit allows for 
hybridization to the entire APC gene. The probes may overlap with 
each other or be contiguous. 
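
The notion of a battery of probes which together cover an entire sequence, either contiguously or with overlap, can be illustrated with the following sketch. The function name `tile_probes`, the probe length, and the overlap value are assumptions chosen for the example, and the probes are represented by the target segments they cover rather than by their complementary sequences.

```python
def tile_probes(sequence, probe_len, overlap=0):
    """Return (start, segment) pairs that together cover every base.

    overlap=0 gives contiguous probes; a positive overlap makes each
    probe share that many bases with the previous one.
    """
    if probe_len <= 0 or not 0 <= overlap < probe_len:
        raise ValueError("need probe_len > 0 and 0 <= overlap < probe_len")
    if not sequence:
        return []
    step = probe_len - overlap
    probes = []
    start = 0
    while True:
        probes.append((start, sequence[start:start + probe_len]))
        if start + probe_len >= len(sequence):
            break  # the last probe reaches the end of the sequence
        start += step
    return probes


# Letters stand in for bases so that coverage is easy to see.
print(tile_probes("ABCDEFGHIJ", 4))
print(tile_probes("ABCDEFGHIJ", 4, overlap=2))
```

With no overlap the final probe may be shorter than the others. With overlap, more probes are needed, but a mutation near the end of one probe also falls nearer the middle of a neighboring probe.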

If a riboprobe is used to detect mismatches with mRNA, it is 
complementary to the mRNA of the human wild-type APC gene. The
riboprobe thus is an anti-sense probe in that it does not code for the
APC protein because it is of the opposite polarity to the sense strand.
The riboprobe generally will be labeled with a radioactive,
colorimetric, or fluorometric material, which can be accomplished by
any means known in the art. If the riboprobe is used to detect mis-
matches with DNA it can be of either polarity, sense or anti-sense.
Similarly, DNA probes also may be used to detect mismatches. 
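
The relationship between the sense strand and an anti-sense riboprobe can be shown with a short sketch. It assumes that the input is a sense-strand cDNA segment written 5' to 3'. The function name `antisense_riboprobe` and the six-base example are hypothetical.

```python
def antisense_riboprobe(sense_dna):
    """Return the anti-sense RNA, written 5' to 3', for a sense DNA segment."""
    pairs = {"A": "U", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(sense_dna.upper()))


# The mRNA for this invented segment would read AUGGCC.
print(antisense_riboprobe("ATGGCC"))
```

The output is complementary to the mRNA and therefore does not itself encode the protein, which is the sense in which the riboprobe is an anti-sense probe.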

Nucleic acid probes may also be complementary to mutant
alleles of the APC gene. These are useful to detect similar mutations
in other patients on the basis of hybridization rather than mismatches.
These are discussed above and referred to as allele-specific probes. As
mentioned above, the APC probes can also be used in Southern hybrid-
izations to genomic DNA to detect gross chromosomal changes such as
deletions and insertions. The probes can also be used to select cDNA
clones of APC genes from tumor and normal tissues. In addition, the
probes can be used to detect APC mRNA in tissues to determine if
expression is diminished as a result of alteration of wild-type APC
genes. Provided with the APC coding sequence shown in Figure 7 (SEQ
ID NO: 1), design of particular probes is well within the skill of the 
ordinary artisan. 

According to the present invention a method is also provided of
supplying wild-type APC function to a cell which carries mutant APC
alleles. Supplying such function should suppress neoplastic growth of
the recipient cells. The wild-type APC gene or a part of the gene may
be introduced into the cell in a vector such that the gene remains
extrachromosomal. In such a situation the gene will be expressed by
the cell from the extrachromosomal location. If a gene portion is
introduced and expressed in a cell carrying a mutant APC allele, the
gene portion should encode a part of the APC protein which is required
for non-neoplastic growth of the cell. More preferred is the situation
where the wild-type APC gene or a part of it is introduced into the
mutant cell in such a way that it recombines with the endogenous
mutant APC gene present in the cell. Such recombination requires a
double recombination event which results in the correction of the APC
gene mutation. Vectors for introduction of genes both for recombina-
tion and for extrachromosomal maintenance are known in the art and
any suitable vector may be used. Methods for introducing DNA into
cells such as electroporation, calcium phosphate co-precipitation and
viral transduction are known in the art and the choice of method is
within the competence of the routineer. Cells transformed with the
wild-type APC gene can be used as model systems to study cancer 
remission and drug treatments which promote such remission. 

Similarly, cells and animals which carry a mutant APC allele can 
be used as model systems to study and test for substances which have 
potential as therapeutic agents. The cells are typically cultured 
epithelial cells. These may be isolated from individuals with APC 
mutations, either somatic or germline. Alternatively, the cell line can 
be engineered to carry the mutation in the APC allele. After a test 
substance is applied to the cells, the neoplastically transformed
phenotype of the cell will be determined. Any trait of neoplastically
transformed cells can be assessed, including anchorage-independent
growth, tumorigenicity in nude mice, invasiveness of cells, and growth
factor dependence. Assays for each of those traits are known in the art.

Animals for testing therapeutic agents can be selected after 
mutagenesis of whole animals or after treatment of germline cells or 
zygotes. Such treatments include insertion of mutant APC alleles, usu- 
ally from a second animal species, as well as insertion of disrupted 
homologous genes. Alternatively, the endogenous APC gene(s) of the
animals may be disrupted by insertion or deletion mutation. After test 
substances have been administered to the animals, the growth of 
tumors must be assessed. If the test substance prevents or suppresses 
the growth of tumors, then the test substance is a candidate therapeu- 
tic agent for the treatment of FAP and/or sporadic cancers. 

Polypeptides which have APC activity can be supplied to cells 
which carry mutant or missing APC alleles. The sequence of the APC 
protein is disclosed in Figure 3 or 7 (SEQ ID NO: 7 or 1). These two
sequences differ slightly and appear to indicate the existence of two
different forms of the APC protein. Protein can be produced by
expression of the cDNA sequence in bacteria, for example, using known
expression vectors. Alternatively, APC can be extracted from
APC-producing mammalian cells such as brain cells. In addition, the
techniques of synthetic chemistry can be employed to synthesize APC
protein. Any of such techniques can provide the preparation of the
present invention which comprises the APC protein. The preparation
is substantially free of other human proteins. This is most readily
accomplished by synthesis in a microorganism or in vitro. 

Active APC molecules can be introduced into cells by 
microinjection or by use of liposomes, for example. Alternatively,
some such active molecules may be taken up by cells, actively or by
diffusion. Extracellular application of APC gene product may be
sufficient to affect tumor growth. Supply of molecules with APC activity
should lead to a partial reversal of the neoplastic state. Other
molecules with APC activity may also be used to effect such a reversal, for
example peptides, drugs, or organic compounds. 

The present invention also provides a preparation of antibodies 
immunoreactive with a human APC protein. The antibodies may be 
polyclonal or monoclonal and may be raised against native APC pro- 
tein, APC fusion proteins, or mutant APC proteins. The antibodies 
should be immunoreactive with APC epitopes, preferably epitopes not 
present on other human proteins. In a preferred embodiment of the 
invention the antibodies will immunoprecipitate APC proteins from
solution as well as react with APC protein on Western or immunoblots 
of polyacrylamide gels. In another preferred embodiment, the antibod- 
ies will detect APC proteins in paraffin or frozen tissue sections, using 
immunocytochemical techniques. Techniques for raising and purifying 
antibodies are well known in the art and any such techniques may be 
chosen to achieve the preparation of the invention. 

Predisposition to cancers as in FAP and GS can be ascertained
by testing any tissue of a human for mutations of the APC gene. For 
example, a person who has inherited a germline APC mutation would be 
prone to develop cancers. This can be determined by testing DNA from 
any tissue of the person's body. Most simply, blood can be drawn and 
DNA extracted from the cells of the blood. In addition, prenatal diag- 
nosis can be accomplished by testing fetal cells, placental cells, or 
amniotic fluid for mutations of the APC gene. Alteration of a
wild-type APC allele, whether, for example, by point mutation or by
deletion, can be detected by any of the means discussed above.

Molecules of cDNA according to the present invention are 
intron-free, APC gene coding molecules. They can be made by reverse
transcriptase using the APC mRNA as a template. These molecules
can be propagated in vectors and cell lines as is known in the art. Such 
molecules have the sequence shown in SEQ ID NO: 7. The cDNA can 
also be made using the techniques of synthetic chemistry given the 
sequence disclosed herein. 

A short region of homology has been identified between APC and 
the human m3 muscarinic acetylcholine receptor (mAChR). This 
homology was largely confined to 19 residues in which 6 out of 7 amino
acids (EL(G or A)GLQA) were identical (see Figure 4). Initially, it was
not known whether this homology was significant, because many other
proteins had higher levels of global homology (though few had six out of
seven contiguous amino acids in common). However, a study on the
sequence elements controlling G protein activation by mAChR subtypes
(Lechleiter et al., EMBO J., p. 4381 (1990)) has shown that a 21 amino
acid region from the m3 mAChR completely mediated G protein
specificity when substituted for the 21 amino acids of m2 mAChR at the
analogous protein position. These 21 residues overlap the 19 amino acid 
homology between APC and m3 mAChR. 
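
The "6 out of 7" figure can be checked with a short sketch. It assumes the two aligned heptapeptides differ only at the third position (G in one sequence, A in the other), as the notation EL(G or A)GLQA suggests; the variable names are illustrative and do not indicate which protein carries which residue.

```python
def count_identities(seq_a: str, seq_b: str) -> tuple[int, int]:
    """Return (identical positions, aligned length) for two equal-length peptides."""
    if len(seq_a) != len(seq_b):
        raise ValueError("peptides must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches, len(seq_a)

# Hypothetical labels: the text gives the motif but not the assignment of G or A.
heptamer_with_g = "ELGGLQA"
heptamer_with_a = "ELAGLQA"

matches, length = count_identities(heptamer_with_g, heptamer_with_a)
print(f"{matches} of {length} residues identical")
```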

This connection between APC and the G protein activating 
region of mAChR is intriguing in light of previous investigations
relating G proteins to cancer. For example, the RAS oncogenes, which are
often mutated in colorectal cancers (Vogelstein et al., N. Engl. J.
Med., Vol. 319, p. 525 (1988); Bos et al., Nature, Vol. 327, p. 293 (1987)),
are members of the G protein family (Bourne et al., Nature, Vol. 348,
p. 125 (1990)) as is an in vitro transformation suppressor (Noda et al.,
Proc. Natl. Acad. Sci. USA, Vol. 86, p. 162 (1989)) and genes mutated in
hormone producing tumors (Landis et al., Nature, Vol. 340, p. 692
(1989); Lyons et al., Science, Vol. 249, p. 655 (1990)). Additionally, the
gene responsible for neurofibromatosis (presumably a tumor suppressor
gene) has been shown to activate the GTPase activity of RAS (Xu et al.,
Cell, Vol. 63, p. 835 (1990); Martin et al., Cell, Vol. 63, p. 843 (1990);
Ballester et al., Cell, Vol. 63, p. 851 (1990)). Another interesting link
between G proteins and colon cancer involves the drug sulindac. This
agent has been shown to inhibit the growth of benign colon tumors in
patients with FAP, presumably by virtue of its activity as a
cyclooxygenase inhibitor (Waddell et al., J. Surg. Oncology 24(1), 83
(1983); Waddell et al., Am. J. Surg., 157(1), 175 (1989); Charneau et al.,
Gastroenterologie Clinique et Biologique 14(2), 153 (1990)).
Cyclooxygenase is required to convert arachidonic acid to
prostaglandins and other biologically active molecules. G proteins are
known to regulate phospholipase A2 activity, which generates
arachidonic acid from phospholipids (Role et al., Proc. Natl. Acad. Sci.
USA, Vol. 84, p. 3623 (1987); Kurachi et al., Nature, Vol. 337, p. 555
(1989)). Therefore we propose that wild-type APC protein functions by
interacting with a G protein and is involved in phospholipid
metabolism.

The following are provided for exemplification purposes only and 
are not intended to limit the scope of the invention which has been
described in broad terms above. 
Example 1

This example demonstrates the isolation of a 5.5 Mb region of 
human DNA linked to the FAP locus. Six genes are identified in this 
region, all of which are expressed in normal colon cells and in
colorectal, lung, and bladder tumors.

The cosmid markers YN5.64 and YN5.48 have previously been
shown to delimit an 8 cM region containing the locus for FAP
(Nakamura et al., Am. J. Hum. Genet., Vol. 43, p. 636 (1988)). Further
linkage and pulsed-field gel electrophoresis (PFGE) analysis with
additional markers has shown that the FAP locus is contained within a 4 cM
region bordered by cosmids EF5.44 and L5.99. In order to isolate clones
representing a significant portion of this locus, a yeast artificial
chromosome (YAC) library was screened with various 5q21 markers.
Twenty-one YAC clones, distributed within six contigs and including
5.5 Mb from the region between YN5.64 and YN5.48, were obtained
(Figure 1A).

Three contigs encompassing approximately 4 Mb were contained
within the central portion of this region. The YACs constituting these
contigs, together with the markers used for their isolation and
orientations, are shown in Figure 1. These YAC contigs were obtained in the
following way. To initiate each contig, the sequence of a genomic
marker cloned from chromosome 5q21 was determined and used to
design primers for PCR. PCR was then carried out on pools of YAC
clones distributed in microtiter trays as previously described (Anand
et al., Nucleic Acids Research, Vol. 18, p. 1951 (1990)). Individual YAC
clones from the positive pools were identified by further PCR or
hybridization based assays, and the YAC sizes were determined by
PFGE.

To extend the areas covered by the original YAC clones,
"chromosomal walking" was performed. For this purpose, YAC termini were
isolated by a PCR based method and sequenced (Riley et al., Nucleic
Acids Research, Vol. 18, p. 2887 (1990)). PCR primers based on these
sequences were then used to rescreen the YAC library. For example,
the sequence from an intron of the FER gene (Hao et al., Mol. Cell.
Biol., Vol. 9, p. 1587 (1989)) was used to design PCR primers for
isolation of the 28EC1 and 5EH8 YACs. The termini of the 28EC1 YAC
were sequenced to derive markers RHE28 and LHE28, respectively.
The sequences of these two markers were then used to isolate YAC
clones 15CH12 (from RHE28) and 40CF1 and 29EF1 (from LHE28).
These five YACs formed a contig encompassing 1200 kb (contig 1,
Figure 1B).

Similarly, contig 2 was initiated using cosmid No. 66 sequences, 
and contig 3 was initiated using sequences both from the MCC gene and 
from cosmid EF5.44. A walk in the telomeric direction from YAC 
14FH1 and a walk in the opposite direction from YAC 39GG3 allowed 
connection of the initial contig 3 clones through YAC 37HG4
(Figure 1B).

Multipoint linkage analysis with the various markers used to 
define the contigs, combined with PFGE analysis, showed that contigs 1
and 2 were centromeric to contig 3. These contigs were used as tools
to orient and/or identify genes which might be responsible for FAP.
Six genes were found to lie within this cluster of YACs, as follows: 

Contig 1: FER - The FER gene was discovered through its
homology to the viral oncogene ABL (Hao et al., supra). It has an
intrinsic tyrosine kinase activity, and in situ hybridization with an FER
probe showed that the gene was located at 5q11-23 (Morris et al.,
Cytogenet. Cell. Genet., Vol. 53, p. 4 (1990)). Because of the potential
role of this oncogene-related gene in neoplasia, we decided to evaluate
it further with regards to the FAP locus. A human genomic clone from
FER was isolated (MF 2.3) and used to define a restriction fragment
length polymorphism (RFLP), and the RFLP in turn used to map FER by
linkage analysis using a panel of three generation families. This
showed that FER was very tightly linked to previously defined
polymorphic markers for the FAP locus. The genetic mapping of FER
was complemented by physical mapping using the YAC clones derived
from FER sequences (Figure 1B). Analysis of YAC contig 1 showed that
FER was within 600 kb of cosmid marker M5.28, which maps to within
1.5 Mb of cosmid L5.99 by PFGE of human genomic DNA. Thus, the
YAC mapping results were consistent with the FER linkage data and
PFGE analyses.

Contig 2: TB1 - TB1 was identified through a cross-hybridization
approach. Exons of genes are often evolutionarily conserved while
introns and intergenic regions are much less conserved. Thus, if a
human probe cross-hybridizes strongly to the DNA from non-primate
species, there is a reasonable chance that it contains exon sequences.
Subclones of the cosmids shown in Figure 1 were used to screen
Southern blots containing rodent DNA samples. A subclone of cosmid K5.66
(p 5.66-4) was shown to strongly hybridize to rodent DNA, and this
clone was used to screen cDNA libraries derived from normal adult
colon and fetal liver. The ends of the initial cDNA clones obtained in
this screen were then used to extend the cDNA sequence. Eventually,
11 cDNA clones were isolated, covering 2314 bp. The gene detected by
these clones was named TB1. Sequence analysis of the overlapping
clones revealed an open reading frame (ORF) that extended for 1302 bp
starting from the most 5' sequence data obtained (Figure 2A). If this
entire open reading frame were translated, it would encode 434 amino
acids. The product of this gene was not globally homologous to any
other sequence in the current database but showed two significant local
similarities to a family of ADP, ATP carrier/translocator proteins and
mitochondrial brown fat uncoupling proteins which are widely
distributed from yeast to mammals. These conserved regions of TB1
(underlined in Figure 2A) may define a predictive motif for this
sequence family. In addition, TB1 appeared to contain a signal peptide
(or mitochondrial targeting sequence) as well as at least 7
transmembrane domains.

Contig 3: MCC, TB2, SRP and APC - The MCC gene was also
discovered through a cross-hybridization approach, as described
previously (Kinzler et al., Science, Vol. 251, p. 1366 (1991)). The MCC gene
was considered a candidate for causing FAP by virtue of its tight
genetic linkage to FAP susceptibility and its somatic mutation in
sporadic colorectal carcinomas. However, mapping experiments suggested
that the coding region of MCC was approximately 50 kb proximal to
the centromeric end of a 200 kb deletion found in an FAP patient.
MCC cDNA probes detected a 10 kb mRNA transcript on Northern blot
analysis of which 4151 bp, including the entire open reading frame,
have been cloned. Although the 3' non-translated portion or an
alternatively spliced form of MCC might have extended into this deletion, it
was possible that the deletion did not affect the MCC gene product.
We therefore used MCC sequences to initiate a YAC contig, and
subsequently used the YAC clones to identify genes 50 to 250 kb distal to
MCC that might be contained within the deletion.

In a first approach, the insert from YAC 24ED6 (Figure 1B) was
radiolabelled and hybridized to a cDNA library from normal colon. One
of the cDNA clones (YS39) identified in this manner detected a 3.1 kb
mRNA transcript when used as a probe for Northern blot hybridization.
Sequence analysis of the YS39 clone revealed that it encompassed 2283
nucleotides and contained an ORF that extended for 555 bp from the
most 5' sequence data obtained. If all of this ORF were translated, it
would encode 185 amino acids (Figure 2B). The gene detected by YS39
was named TB2. Searches of nucleotide and protein databases revealed
that the TB2 gene was not identical to any previously reported
sequences nor were there any striking similarities.
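
The amino acid counts quoted for the two partial ORFs above follow directly from dividing the ORF length by three. The sketch below is only an arithmetic check; it assumes every base pair of the stated ORF length is translated and that no stop codon is included in that length, which matches the "if all of this ORF were translated" wording. The function name is illustrative.

```python
def codons_in_orf(orf_length_bp: int) -> int:
    """Number of complete codons in an ORF whose length is a multiple of 3."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length is not a whole number of codons")
    return orf_length_bp // 3

# ORF lengths as stated in the text for the two partial cDNAs.
orf_lengths_bp = {"first ORF (1302 bp)": 1302, "second ORF (555 bp)": 555}

for label, length_bp in orf_lengths_bp.items():
    print(label, "->", codons_in_orf(length_bp), "amino acids")
```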

Another clone (YS11) identified through the YAC 24ED6 screen
appeared to contain portions of two distinct genes. Sequences from
one end of YS11 were identical to at least 180 bp of the signal
recognition particle protein SRP19 (Lingelbach et al., Nucleic Acids Research,
Vol. 16, p. 9431 (1988)). A second ORF, from the opposite end of clone
YS11, proved to be identical to 78 bp of a novel gene which was
independently identified through a second YAC-based approach. For the
latter, DNA from yeast cells containing YAC 14FH1 (Figure 1B) was
digested with EcoRI and subcloned into a plasmid vector. Plasmids that
contained human DNA fragments were selected by colony hybridization
using total human DNA as a probe. These clones were then used to
search for cross-hybridizing sequences as described above for TB1, and
the cross-hybridizing clones were subsequently used to screen cDNA
libraries. One of the cDNA clones discovered in this way (FH38)
contained a long ORF (2496 bp), 78 bp of which were identical to the
above-noted sequences in YS11. The ends of the FH38 cDNA clone
were then used to initiate cDNA walking to extend the sequence.
Eventually, 85 cDNA clones were isolated from normal colon, brain and
liver cDNA libraries and found to encompass 8973 nucleotides of
contiguous transcript. The gene corresponding to this transcript was
named APC. When used as probes for Northern blot analysis, APC
cDNA clones hybridized to a single transcript of approximately 9.5 kb,
suggesting that the great majority of the gene product was represented
in the cDNA clones obtained. Sequences from the 5' end of the APC
gene were found in YAC 37HG4 but not in YAC 14FH1. However, the
3' end of the APC gene was found in 14FH1 as well as 37HG4. The
yeast artificial chromosome of the present invention designated
YAC 37HG4 has been deposited with the National Collection of
Industrial and Marine Bacteria (NCIMB), P.O. Box 31, 135 Abbey Road,
Aberdeen AB9 8DG, Scotland, prior to the filing of this patent
application. The NCIMB Accession Number of YAC clone YAC 37HG4 is
40353. Analogously, the 5' end of the MCC coding region was found in
YAC clones 19AA9 and 26GC3 but not 24ED6 or 14FH1, while the 3'
end displayed the opposite pattern. Thus, MCC and APC transcription
units pointed in opposite directions, with the direction of transcription
going from centromeric to telomeric in the case of MCC, and telomeric
to centromeric in the case of APC. PFGE analysis of YAC DNA
digested with various restriction endonucleases showed that TB2 and
SRP were between MCC and APC, and that the 3' ends of the coding
regions of MCC and APC were separated by approximately 150 kb
(Figure 1B).

Sequence analysis of the APC cDNA clones revealed an open 
reading frame of 8,535 nucleotides. The 5' end of the ORF contained a
methionine codon (codon 1) that was preceded by an in-frame stop 
codon 9 bp upstream, and the 3' end was followed by several in-frame 
stop codons. The protein produced by initiation at codon 1 would con- 
tain 2,842 amino acids (Figure 3). The results of database searching 
with the APC gene product were quite complex due to the presence of 
large segments with locally biased amino acid compositions. In spite of 
this, APC could be roughly divided into two domains. The N-terminal 
25% of the protein had a high content of leucine residues (12%) and 
showed local sequence similarities to myosins, various intermediate 
filament proteins (e.g., desmin, vimentin, neurofilaments) and 
Drosophila armadillo/human plakoglobin. The latter protein is a
component of adhesive junctions (desmosomes) joining epithelial cells
(Franke et al., Proc. Natl. Acad. Sci. U.S.A., Vol. 86, p. 4027 (1989);
Peifer et al., Cell, Vol. 63, p. 1167 (1990)). The C-terminal 75% of APC
(residues 731-2832) is 17% serine by composition with serine residues 
more or less uniformly distributed. This large domain also contains 
local concentrations of charged (mostly acidic) and proline residues. 
There was no indication of potential signal peptides, transmembrane 
regions, or nuclear targeting signals in APC, suggesting a cytoplasmic
localization. 

To detect short similarities to APC, a database search was
performed using the PAM-40 matrix (Altschul, J. Mol. Biol., Vol. 219, p. 555
(1991)). Potentially interesting matches to several proteins were found.
The most suggestive of these involved the ral2 gene product of yeast,
which is implicated in the regulation of ras activity (Fukui et al., Mol.
Cell. Biol., Vol. 9, p. 5617 (1989)). Little is known about how ral2 might
interact with ras but it is interesting to note the positively-charged
character of this region in the context of the negatively-charged GAP 
interaction region of ras. A specific electrostatic interaction between 
ras and GAP-related proteins has been proposed.

Because of the proximity of the MCC and APC genes, and the
fact that both are implicated in colorectal tumorigenesis, we searched
for similarities between the two predicted proteins. Bourne has
previously noted that MCC has the potential to form alpha-helical coiled
coils (Nature, Vol. 351, p. 188 (1991)). Lupas and colleagues have
recently developed a program for predicting coiled coil potential from
primary sequence data (Science, Vol. 252, p. 1162 (1991)), and we have
used their program to analyze both MCC and APC. Analysis of MCC
indicated a discontinuous pattern of coiled-coil domains separated by
putative "hinge" or "spacer" regions similar to those seen in laminin
and other intermediate filament proteins. Analysis of the APC
sequence revealed two regions in the N-terminal domain which had 
strong coiled coil-forming potential, and these regions corresponded to 
those that showed local similarities with myosin and IF proteins on 
database searching. In addition, one other putative coiled coil region 
was identified in the central region of APC. The potential for both 
APC and MCC to form coiled coils is interesting in that such structures 
often mediate homo- and hetero-oligomerization. 

Finally, it had previously been noted that MCC shared a short 
similarity with the region of the m3 muscarinic acetylcholine receptor 
(mAChR) known to regulate specificity of G-protein coupling. The
APC gene also contained a local similarity to the region of the m3 
mAChR that overlapped with the MCC similarity (Figure 4B). Although 
the similarities to ral2 (Figure 4A) and m3 mAChR (Figure 4B) were not 
statistically significant, they were intriguing in light of previous obser- 
vations relating G-proteins to neoplasia. 

Each of the six genes described above was expressed in normal 
colon mucosa, as indicated by their representation in colon cDNA
libraries. To study expression of the genes in neoplastic colorectal
epithelium, we employed reverse transcription-polymerase chain
reaction (PCR) assays. The sequences of FER, TB1, TB2,
MCC, and APC were each used to design primers for PCR performed
with cDNA templates. Each of these genes was found to be expressed
in normal colon, in each of ten cell lines derived from colorectal
cancers, and in tumor cell lines derived from lung and bladder tumors. The
ten colorectal cancer cell lines included eight from patients with
sporadic CRC and two from patients with FAP.
Example 2 

This example demonstrates a genetic analysis of the role of the 
FER gene in FAP and sporadic colorectal cancers. 

We considered FER as a candidate because of its proximity to 
the FAP locus as judged by physical and genetic criteria (see
Example 1), and its homology to known tyrosine kinases with oncogenic
potential. Primers were designed to PCR-amplify the complete coding
sequence of FER from the RNA of two colorectal cancer cell lines
derived from FAP patients. cDNA was generated from RNA and used
as a template for PCR. The primers used were
5'-AGAAGGATCCCTTGTGCAGTGTGGA-3' and
5'-GACAGGATCCTGAAGCTGAGTTTG-3'. The underlined nucleotides
were altered from the true FER sequence to create BamHI sites. The
cell lines used were JW and Difi, both derived from colorectal cancers
of FAP patients (C. Paraskeva, B.G. Buckle, D. Sheer, C.B. Wigley,
Int. J. Cancer 34, 49 (1984); M.E. Gross et al., Cancer Res. 51, 1452
(1991)). The resultant 2554 base pair fragments were cloned and
sequenced in their entirety. The PCR products were cloned in the 
BamHI site of Bluescript SK (Stratagene) and pools of at least 50 clones 
were sequenced en masse using T7 polymerase, as described in Nigro 
et al., Nature 342, 705 (1989). 
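
As a quick check on the primer design described above, the following sketch locates the BamHI recognition sequence (GGATCC) in each of the two FER amplification primers. The dictionary keys "primer_1" and "primer_2" are illustrative labels only; the text does not say which primer is sense and which is antisense.

```python
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence

# Primer sequences as given in the text, written 5' to 3'.
fer_primers = {
    "primer_1": "AGAAGGATCCCTTGTGCAGTGTGGA",
    "primer_2": "GACAGGATCCTGAAGCTGAGTTTG",
}

def site_offset(primer: str, site: str = BAMHI_SITE) -> int:
    """Return the 0-based offset of the site in the primer, or -1 if absent."""
    return primer.find(site)

for name, sequence in fer_primers.items():
    print(name, "BamHI site at offset", site_offset(sequence))
```

Both primers carry the site near their 5' ends, which is consistent with cloning the PCR products into the BamHI site of the vector.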

Only a single conservative amino acid change (GTG->CTG, cre- 
ating a val to leu substitution at codon 439) was observed. The region 
surrounding this codon was then amplified from the DNA of individuals 
without FAP and this substitution was found to be a common 
polymorphism, not specifically associated with FAP. Based on these 
results, we considered it unlikely (though still possible) that the FER gene
was responsible for FAP. To amplify the regions surrounding codon
439, the following primers were used: 5'-TCAGAAAGTGCTGAAGAG-3'
and 5'-GGAATAATTAGGTCTCCAA-3'. PCR products were digested
with PstI, which yields a 50 bp fragment if codon 439 is leucine, but 26
and 24 bp fragments if it is valine. The primers used for sequencing
were chosen from the FER cDNA sequence in Hao et al., supra.
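
The fragment sizes stated above can be turned into a simple lookup for reading a digest. This is a minimal sketch that takes the size-to-allele mapping exactly as written in the text (50 bp for leucine; 26 and 24 bp for valine); the genotype labels, including the heterozygote call, are assumptions added for illustration.

```python
def call_codon_439(fragment_sizes_bp) -> str:
    """Interpret PstI digest fragments using the sizes stated in the text."""
    sizes = set(fragment_sizes_bp)
    has_leucine_allele = 50 in sizes           # 50 bp fragment
    has_valine_allele = {26, 24} <= sizes      # 26 bp and 24 bp fragments
    if has_leucine_allele and has_valine_allele:
        return "Val/Leu"
    if has_leucine_allele:
        return "Leu/Leu"
    if has_valine_allele:
        return "Val/Val"
    return "uninterpretable"

print(call_codon_439([50]))
print(call_codon_439([26, 24]))
print(call_codon_439([50, 26, 24]))
```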
Example 3

This example demonstrates the genetic analysis of MCC, TB2, 
SRP and APC in FAP and sporadic colorectal tumors. Each of these 
genes is linked and encompassed by contig 3 (see Figure 1). 

Several lines of evidence suggested that this contig was of par- 
ticular interest. First, at least three of the four genes in this contig 
were within the deleted region identified in two FAP patients. (See 
Example 5 infra.) Second, allelic deletions of chromosome 5q21 in spo- 
radic cancers appeared to be centered in this region. (Ashton-Rickardt 
et al., Oncogene, in press; and Miki et al., Jpn. J. Cancer Res., in
press.) Some tumors exhibited loss of proximal RFLP markers (up to
and potentially including the 5' end of MCC), but no loss of markers
distal to MCC. Other tumors exhibited loss of markers distal to and 
perhaps including the 3' end of MCC, but no loss of sequences proximal 
to MCC. This suggested either that different ends of MCC were 
affected by loss in all such cases, or alternatively, that two genes (one 
proximal to and perhaps including MCC, the other distal to MCC) were 
separate targets of deletion. Third, clones from each of the six FAP 
region genes were used as probes on Southern blots containing tumor 
DNA from patients with sporadic CRC. Only two examples of somatic 
changes were observed in over 200 tumors studied: a 
rearrangement/deletion whose centromeric end was located within the
MCC gene (Kinzler et al., supra) and an 800 bp insertion within the
APC gene between nucleotides 4424 and 5584. Fourth, point mutations
of MCC were observed in two tumors (Kinzler et al., supra), strongly
suggesting that MCC was a target of mutation in at least some sporadic 
colorectal cancers. 

Based on these results, we attempted to search for subtle alter- 
ations of contig 3 genes in patients with FAP. We chose to examine 
MCC and APC, rather than TB2 or SRP, because of the somatic
mutations in MCC and APC noted above. To facilitate the identification of
subtle alterations, the genomic sequences of MCC and APC exons were 
determined (see Table I). These sequences were used to design primers 
for PCR analysis of constitutional DNA from FAP patients.

We first amplified eight exons and surrounding introns of the
MCC gene in affected individuals from 90 different FAP kindreds. The
PCR products were analyzed by a ribonuclease (RNase) protection assay.
In brief, the PCR products were hybridized to in vitro transcribed RNA
probes representing the normal genomic sequences. The hybrids were
digested with RNase A, which can cleave at single base pair
mismatches within DNA-RNA hybrids, and the cleavage products were
visualized following denaturing gel electrophoresis. Two separate
RNase protection analyses were performed for each exon, one with the
sense and one with the antisense strand. Under these conditions,
approximately 40% of all mismatches are detectable. Although some
amino acid variants of MCC were observed in FAP patients, all such
variants were found in a small percentage of normal individuals. These
variants were thus unlikely to be responsible for the inheritance of
FAP.

We next examined three exons of the APC gene. The three 
exons examined included those containing nt 822-930, 931-1309, and 
the first 300 nt of the most distal exon (nt 1956-2256). PCR and RNase 
protection analysis were performed as described in Kinzler et al., supra,
using the primers underlined in Table I. The primers for nt 1956-2256 
were 5'-GCAAATCCTAAGAGAGAACAA-3' and
5'-GATCGCAAGCTTGAGCCAG-3'.

In 90 kindreds, the RNase protection method was used to screen 
for mutations and in an additional 13 kindreds, the PCR products were 
cloned and sequenced to search for mutations not detectable by RNase 
protection. PCR products were cloned into a Bluescript vector
modified as described in T.A. Holton and M.W. Graham, Nucleic Acids Res.
19, 1156 (1991). A minimum of 100 clones were pooled and sequenced.
Five variants were detected among the 103 kindreds analyzed. Cloning 
and subsequent DNA sequencing of the PCR product of patient P21 
indicated a C to T transition in codon 413 that resulted in a change 
from arginine to cysteine. This amino acid variant was not observed in 
any of 200 DNA samples from individuals without FAP. Cloning and 
sequencing of the PCR product from patients P24 and P34, who
demonstrated the same abnormal RNase protection pattern, indicated that
both had a C to T transition at codon 301 that resulted in a change
from arginine (CGA) to a stop codon (TGA). This change was not
present in 200 individuals without FAP. As this point mutation resulted
in the predicted loss of the recognition site for the enzyme Taq I,
appropriate PCR products could be digested with Taq I to detect the 
mutation. This allowed us to determine that the stop codon 
co-segregated with disease phenotype in members of the family of P24. 
The inheritance of this change in affected members of the pedigree 
provides additional evidence for the importance of the mutation. 
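
The loss of the Taq I site can be illustrated with a short sketch. Taq I recognizes TCGA, so a CGA codon preceded by a T forms a site that the CGA to TGA change destroys. The flanking bases used below are hypothetical placeholders chosen only to make the example self-contained; they are not taken from the APC sequence.

```python
TAQI_SITE = "TCGA"  # Taq I recognition sequence

def has_taqi_site(sequence: str) -> bool:
    """True if the sequence contains at least one Taq I recognition site."""
    return TAQI_SITE in sequence

# Hypothetical flanks; only the central codon (CGA vs TGA) comes from the text.
upstream_flank = "GCT"
downstream_flank = "AAG"

wild_type = upstream_flank + "CGA" + downstream_flank   # arginine codon
mutant = upstream_flank + "TGA" + downstream_flank      # stop codon

print("wild type cut by Taq I:", has_taqi_site(wild_type))
print("mutant cut by Taq I:", has_taqi_site(mutant))
```

Under these assumptions a PCR product from a wild-type allele would be cut by Taq I and one from the mutant allele would not, which is the basis for following the mutation through a pedigree.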

Cloning and sequencing of the PCR product from FAP patient 
P93 indicated a C to G transversion at codon 279, also resulting in a 
stop codon (change from TCA to TGA). This mutation was not present 
in 200 individuals without FAP. Finally, one additional mutation result- 
ing in a serine (TCA) to stop codon (TGA) at codon 712 was detected in 
a single patient with FAP (patient P60). 
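
The codon changes reported above can be checked against the standard genetic code with a small sketch. It builds the standard codon table, translates the reference and mutant codons, and classifies the single-base change as a transition (purine to purine or pyrimidine to pyrimidine) or a transversion. The function and variable names are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
PURINES = {"A", "G"}

def describe_point_mutation(ref_codon: str, alt_codon: str):
    """Return (ref amino acid, alt amino acid, 'transition' or 'transversion')."""
    changes = [(r, a) for r, a in zip(ref_codon, alt_codon) if r != a]
    if len(changes) != 1:
        raise ValueError("codons must differ at exactly one position")
    ref_base, alt_base = changes[0]
    same_class = (ref_base in PURINES) == (alt_base in PURINES)
    kind = "transition" if same_class else "transversion"
    return CODON_TABLE[ref_codon], CODON_TABLE[alt_codon], kind

# Codon changes as stated in the text ('*' denotes a stop codon).
print("codon 301:", describe_point_mutation("CGA", "TGA"))
print("codon 279:", describe_point_mutation("TCA", "TGA"))
```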

The five germline mutations identified are summarized in 
Table IIA, as well as four others discussed in Example 9. In addition to
these germline mutations, we identified several somatic mutations of
MCC and APC in sporadic CRCs. Seventeen MCC exons were
examined in 90 sporadic colorectal cancers by RNase protection analysis. In
each case where an abnormal RNase protection pattern was observed,
the corresponding PCR products were cloned and sequenced. This led
to the identification of six point mutations (two described previously)
(Kinzler et al., supra), each of which was not found in the germline of
these patients (Table IIB). Four of the mutations resulted in amino acid
substitutions and two resulted in the alteration of splice site consensus
elements. Mutations at analogous splice site positions in other genes
have been shown to alter RNA processing in vivo and in vitro.

Three exons of APC were also evaluated in sporadic tumors. 
Sixty tumors were screened by RNase protection, and an additional 98 
tumors were evaluated by sequencing. The exons examined included nt 
822-930, 931-1309, and 1406-1545 (Table I). A total of three mutations 
were identified, each of which proved to be somatic. Tumor T27 con- 
tained a somatic mutation of CGA (arginine) to TGA (stop codon) at 
codon 33. Tumor T135 contained a GT to GC change at a splice donor 
site. Tumor T34 contained a 5 bp insertion (CAGCC between codons
288 and 289) resulting in a stop at codon 291 due to a frameshift. 

We serendipitously discovered one additional somatic mutation in
a colorectal cancer. During our attempt to define the sequences and
splice patterns of the MCC and APC gene products in colorectal
epithelial cells, we cloned cDNA from the colorectal cancer cell line
SW480. The amino acid sequence of the MCC gene from SW480 was
identical to that previously found in clones from human brain. The
sequence of APC in SW480 cells, however, differed significantly, in
that a transition at codon 1338 resulted in a change from glutamine
(CAG) to a stop codon (TAG). To determine if this mutation was
somatic, we recovered DNA from archival paraffin blocks of the
original surgical specimen (T201) from which the tumor cell line was
derived 28 years ago.

DNA was purified from paraffin sections as described in S.E.
Goelz, S.R. Hamilton, and B. Vogelstein, Biochem. Biophys. Res.
Comm. 130, 118 (1985). PCR was performed as described in reference
24, using the primers 5'-GTTCCAGCAGTGTCACAG-3' and
5'-GGGAGATTTCGCTCCTGA-3'. A PCR product containing codon
1338 was amplified from the archival DNA and used to show that the
stop codon represented a somatic mutation present in the original
primary tumor and in cell lines derived from the primary and metastatic
tumor sites, but not from normal tissue of the patient.
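
As a quick sanity check on the two primers given above, the following sketch reports their length, GC fraction, and a rough melting temperature. The Wallace rule used here (2 degrees per A/T, 4 degrees per G/C) is a common rule of thumb for short oligonucleotides and is an assumption of this sketch, not a method stated in the text; the labels `primer_1` and `primer_2` are likewise illustrative.

```python
PRIMERS = {
    "primer_1": "GTTCCAGCAGTGTCACAG",
    "primer_2": "GGGAGATTTCGCTCCTGA",
}


def gc_count(primer):
    """Number of G and C bases in the primer."""
    primer = primer.upper()
    return primer.count("G") + primer.count("C")


def wallace_tm(primer):
    """Rough Tm estimate in degrees C: 2*(A+T) + 4*(G+C)."""
    gc = gc_count(primer)
    at = len(primer) - gc
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    fraction = gc_count(seq) / len(seq)
    print(name, len(seq), round(fraction, 2), wallace_tm(seq))
```

Both primers are 18-mers with the same base composition by this measure, so the rule of thumb gives them the same estimated melting temperature.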

The ten point mutations in the MCC and APC genes so far dis-
covered in sporadic CRCs are summarized in Table IIB. Analysis of the
number of mutant and wild-type PCR clones obtained from each of
these tumors showed that in eight of the ten cases, the wild-type
sequence was present in approximately equal proportions to the
mutant. This was confirmed by RFLP analysis using flanking markers
from chromosome 5q, which demonstrated that only two of the ten
tumors (T135 and T201) exhibited an allelic deletion on chromosome 5q.
These results are consistent with previous observations showing that
20-40% of sporadic colorectal tumors had allelic deletions of
chromosome 5q. Moreover, these data suggest that mutations of 5q21 genes
are not limited to those colorectal tumors which contain allelic
deletions of this chromosome.
Example 4 

This example characterizes small, nested deletions in DNA from 
two unrelated FAP patients. 

DNA from 40 FAP patients was screened with cosmids that had 
been mapped into a region near the APC locus to identify small dele- 
tions or rearrangements. Two of these cosmids, L5.71 and L5.79, 
hybridized with a 1200 kb NotI fragment in DNAs from most of the FAP 
patients screened. 

The DNA of one FAP patient, 3214, showed only a 940 kb NotI
fragment instead of the expected 1200 kb fragment. DNA was ana-
lyzed from four other members of the patient's immediate family; the
940 kb fragment was present in her affected mother (4711), but not in
the other, unaffected family members. The mother also carried a nor-
mal 1200 kb NotI fragment that was transmitted to her two unaffected
offspring. These observations indicated that the mutant polyposis
allele is on the same chromosome as the 940 kb NotI fragment. A
simple interpretation is that APC patients 3214 and 4711 each carry a 260
kb deletion within the APC locus.

If a deletion were present, then other enzymes might also be
expected to produce fragments with altered mobilities. Hybridization
of L5.79 to NruI-digested DNAs from both affected members of the
family revealed a novel NruI fragment of 1300 kb, in addition to the
normal 1200 kb NruI fragment. Furthermore, MluI fragments in
patients 3214 and 4711 also showed an increase in size consistent with
the deletion of an MluI site. The two chromosome 5 homologs of
patient 3214 were segregated in somatic cell hybrid lines; HHW1155
(deletion hybrid) carried the abnormal homolog and HHW1159 (normal
hybrid) carried the normal homolog.

Because patient 3214 showed only a 940 kb NotI fragment, she
had not inherited the 1200 kb fragment present in the unaffected
father's DNA. This observation suggests that he must be heterozygous
for, and have transmitted, either a deletion of the L5.79 probe region
or a variant NotI fragment too large to resolve on the gel system. As
expected, the hybrid cell line HHW1159, which carries the paternal
homolog, revealed no resolved NotI fragment when probed with L5.79.
However, probing of HHW1159 DNA with L5.79 following digestion with
other enzymes did reveal restriction fragments, demonstrating the
presence of DNA homologous to the probe. The father is, therefore,
interpreted as heterozygous for a polymorphism at the NotI site, with
one chromosome 5 having a 1200 kb NotI fragment and the other hav-
ing a fragment too large to resolve consistently on the gel. The latter
was transmitted to patient 3214.

When double digests were used to order restriction sites within
the 1200 kb NotI fragment, L5.71 and L5.79 were both found to lie on a
550 kb NotI-NruI fragment and, therefore, on the same side of an NruI
site in the 1200 kb NotI fragment. To obtain genomic representation of
sequences present over the entire 1200 kb NotI fragment, we con-
structed a library of small-fragment inserts enriched for sequences
from this fragment. DNA from the somatic cell hybrid HHW141, which
contains about 40% of chromosome 5, was digested with NotI and
electrophoresed under pulsed-field gel (PFG) conditions; EcoRI frag-
ments from the 1200 kb region of this gel were cloned into a phage
vector. Probe Map30 was isolated from this library. In normal individ-
uals probe Map30 hybridizes to the 1200 kb NotI fragment and to a 200
kb NruI fragment. This latter hybridization places Map30 distal, with
respect to the locations of L5.71 and L5.79, to the NruI site of the 550
kb NotI-NruI fragment.

Because Map30 hybridized to the abnormal, 1300 kb NruI frag-
ment of patient 3214, the locus defined by Map30 lies outside the
hypothesized deletion. Furthermore, in normal chromosomes Map30
identified a 200 kb NruI fragment and L5.79 identified a 1200 kb NruI
fragment; the hypothesized deletion must, therefore, be removing an
NruI site, or sites, lying between Map30 and L5.79, and these two
probes must flank the hypothesized deletion. A restriction map of the
genomic region, showing placement of these probes, is shown in
Figure 5.

A NotI digest of DNA from another FAP patient, 3824, was
probed with L5.79. In addition to the 1200 kb normal NotI fragment, a
fragment of approximately 1100 kb was observed, consistent with the
presence of a 100 kb deletion in one chromosome 5. In this case, how-
ever, digestion with NruI and MluI did not reveal abnormal bands, indi-
cating that if a deletion were present, its boundaries must lie distal to
the NruI and MluI sites of the fragments identified by L5.79. Consis-
tent with this expectation, hybridization of Map30 to DNA from
patient 3824 identified a 760 kb MluI fragment in addition to the
expected 860 kb fragment, supporting the interpretation of a 100 kb
deletion in this patient. The two chromosome 5 homologs of patient
3824 were segregated in somatic cell hybrid lines; HHW1291 was found
to carry only the abnormal homolog and HHW1290 only the normal
homolog.
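
The deletion sizes quoted in this example follow directly from the fragment sizes. A minimal sketch of that arithmetic is given below; the dictionary layout and the function name are illustrative assumptions, and the calculation applies only where the deletion lies entirely inside the fragment without removing one of its bounding restriction sites.

```python
def deletion_size(normal_kb, variant_kb):
    """Size of a deletion inferred from a shortened restriction fragment."""
    if variant_kb >= normal_kb:
        raise ValueError("variant fragment is not shorter than normal")
    return normal_kb - variant_kb


# (patient, enzyme) -> (normal fragment in kb, variant fragment in kb),
# using the sizes reported in the text.
OBSERVATIONS = {
    ("3214", "NotI"): (1200, 940),
    ("3824", "NotI"): (1200, 1100),
    ("3824", "MluI"): (860, 760),
}

for (patient, enzyme), (normal, variant) in OBSERVATIONS.items():
    print(patient, enzyme, deletion_size(normal, variant), "kb")
```

The 1300 kb NruI fragment of patient 3214 is deliberately left out: there the fragment grows because the deletion removes restriction sites, so a simple subtraction does not apply.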

That the 860 kb MluI fragment identified by Map30 is distinct
from the 830 kb MluI fragment identified previously by L5.79 was dem-
onstrated by hybridization of Map30 and L5.79 to a NotI-MluI double
digest of DNA from the hybrid cell (HHW1159) containing the
nondeleted chromosome 5 homolog of patient 3214. As previously indi-
cated, this hybrid is interpreted as missing one of the NotI sites that
define the 1200 kb fragment. A 620 kb NotI-MluI fragment was seen
with probe L5.79, and an 860 kb fragment was seen with Map30.
Therefore, the 830 kb MluI fragment recognized by probe L5.79 must
contain a NotI site in HHW1159 DNA; because the 860 kb MluI fragment
remains intact, it does not carry this NotI site and must be distinct
from the 830 kb MluI fragment.
Example 5 

This example demonstrates the isolation of human sequences 
which span the region deleted in the two unrelated FAP patients char- 
acterized in Example 4. 

A strong prediction of the hypothesis that patients 3214 and
3824 carry deletions is that some sequences present on normal chromo-
some 5 homologs would be missing from the hypothesized deletion
homologs. Therefore, to develop genomic probes that might confirm
the deletions, as well as to identify genes from the region, YAC clones
from a contig seeded by cosmid L5.79 were localized from a library
containing seven haploid human genome equivalents (Albertsen et al.,
Proc. Natl. Acad. Sci. U.S.A., Vol. 87, pp. 4256-4260 (1990)) with
respect to the hypothesized deletions. Three clones, YACs 57B8,
310D8, and 183H12, were found to overlap the deleted region.

Importantly, one end of YAC 57B8 (clone AT57) was found to lie
within the patient 3214 deletion. Inverse polymerase chain reaction
(PCR) defined the end sequences of the insert of YAC 57B8. PCR
primers based on one of these end sequences repeatedly failed to
amplify DNA from the somatic cell hybrid (HHW1155) carrying the
deleted homolog of patient 3214, but did amplify a product of the
expected size from the somatic cell hybrid (HHW1159) carrying the
normal chromosome 5 homolog. This result supported the interpreta-
tion that the abnormal restriction fragments found in the DNA of
patient 3214 result from a deletion.

Additional support for the hypothesis of deletion in DNA from
patient 3214 came from subcloned fragments of YAC 183H12, which
spans the region in question. Y11, an EcoRI fragment cloned from
YAC 183H12, hybridized to the normal, 1200 kb NotI fragment of
patient 4711, but failed to hybridize to the abnormal, 940 kb NotI frag-
ment of 4711 or to DNA from deletion cell line HHW1155. This result
confirmed the deletion in patient 3214.

Two additional EcoRI fragments from YAC 183H12, Y10 and
Y14, were localized within the patient 3214 deletion by their failure to
hybridize to DNA from HHW1155. Probe Y10 hybridizes to a 150 kb
NruI fragment in normal chromosome 5 homologs. Because the 3214
deletion creates the 1300 kb NruI fragment seen with the probes L5.79
and Map30 that flank the deletion, these NruI sites and the 150 kb NruI
fragment lying between must be deleted in patient 3214. Furthermore,
probe Y10 hybridizes to the same 620 kb NotI-MluI fragment seen with
probe L5.79 in normal DNA, indicating its location as L5.79-proximal to
the deleted MluI site and placing it between the MluI site and the
L5.79-proximal NruI site. The MluI site must, therefore, lie between
the NruI sites that define the 150 kb NruI fragment (see Figure 5).

Probe Y11 also hybridized to the 150 kb NruI fragment in the
normal chromosome 5 homolog, but failed to hybridize to the 620 kb
NotI-MluI fragment, placing it L5.79-distal to the MluI site, but
proximal to the second NruI site. Hybridization to the same (860 kb)
MluI fragment as Map30 confirmed the localization of probe Y11
L5.79-distal to the MluI site.

Probe Y14 was shown to be L5.79-distal to both deleted NruI
sites by virtue of its hybridization to the same 200 kb NruI fragment of
the normal chromosome 5 seen with Map30. Therefore, the order of
these EcoRI fragments derived from YAC 183H12 and deleted in
patient 3214, with respect to L5.79 and Map30, is
L5.79-Y10-Y11-Y14-Map30.

The 100 kb deletion of patient 3824 was confirmed by the failure
of aberrant restriction fragments in this DNA to hybridize with probe
Y11, combined with positive hybridizations to probes Y10 and/or Y14.
Y10 and Y14 each hybridized to the 1100 kb NotI fragment of patient
3824 as well as to the normal 1200 kb NotI fragment, but Y11 hybrid-
ized to the 1200 kb fragment only. In the MluI digest, probe Y14
hybridized to the 860 kb and 760 kb fragments of patient 3824 DNA, but
probe Y11 hybridized only to the 860 kb fragment. We conclude that
the basis for the alteration in fragment size in DNA from patient 3824
is, indeed, a deletion. Furthermore, because probes Y10 and Y14 are
missing from the deleted 3214 chromosome, but present on the deleted
3824 chromosome, and they have been shown to flank probe Y11, the
deletion in patient 3824 must be nested within the patient 3214
deletion.
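
The nesting argument above is essentially set logic over the ordered probes. The sketch below encodes the probe order and the probes that fail to hybridize to each deleted chromosome, then checks that the 3824 deletion is contained in the 3214 deletion and reports the probes that flank each one. The data structure and function names are illustrative assumptions; the probe order and hybridization results are those stated in the text.

```python
# Probe order along the chromosome, as deduced in the text.
PROBE_ORDER = ["L5.79", "Y10", "Y11", "Y14", "Map30"]

# Probes that fail to hybridize to each deleted chromosome.
MISSING = {
    "3214": {"Y10", "Y11", "Y14"},
    "3824": {"Y11"},
}


def is_nested(inner, outer):
    """True if every probe missing in `inner` is also missing in `outer`,
    and `outer` is missing at least one more."""
    return MISSING[inner] < MISSING[outer]


def flanking_probes(patient):
    """Nearest retained probes on either side of the deleted block.
    Assumes the deletion does not reach the first or last probe."""
    idx = sorted(PROBE_ORDER.index(p) for p in MISSING[patient])
    if idx != list(range(idx[0], idx[-1] + 1)):
        raise ValueError("missing probes are not contiguous")
    return PROBE_ORDER[idx[0] - 1], PROBE_ORDER[idx[-1] + 1]


print(is_nested("3824", "3214"))
print(flanking_probes("3214"))
print(flanking_probes("3824"))
```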

Probes Y10, Y11, Y14 and Map30 each hybridized to YAC 310D8,
indicating that this YAC spanned the patient 3824 deletion and, at a
minimum, most of the 3214 deletion. The YAC characterizations,
therefore, confirmed the presence of deletions in the patients and pro-
vided physical representation of the deleted region.
Example 6 

This example demonstrates that the MCC coding sequence maps 
outside of the region deleted in the two FAP patients characterized in 
Example 4. 

An intriguing FAP candidate gene, MCC, recently was ascer-
tained with cosmid L5.71 and was shown to have undergone mutation in
colon carcinomas (Kinzler et al., supra). It was therefore of interest to
map this gene with respect to the deletions in APC patients. Hybrid-
ization of MCC probes with an overlapping series of YAC clones
extending in either direction from L5.71 showed that the 3' end of MCC
must be oriented toward the region of the two APC deletions.

Therefore, two 3' cDNA clones from MCC were mapped with
respect to the deletions: clone 1C1 (bp 2378-4181) and clone 7 (bp
2890-3560). Clone 1C1 contains sequences from the C-terminal end of
the open reading frame, which stops at nucleotide 2708, as well as 3'
untranslated sequence. Clone 7 contains sequence that is entirely 3' to
the open reading frame. Importantly, the entire 3' untranslated
sequence contained in the cDNA clones consists of a single 2.5 kb exon.
These two clones were hybridized to DNAs from the YACs spanning the
FAP region. Clone 7 fails to hybridize to YAC 310D8, although it does
hybridize to YACs 183H12 and 57B8; the same result was obtained with
the cDNA 1C1. Furthermore, these probes did show hybridization to
DNAs from both hybrid cell lines (HHW1159 and HHW1155) and the
lymphoblastoid cell line from patient 3214, confirming their locations
outside the deleted region. Additional mapping experiments suggested
that the 3' end of the MCC cDNA clone contig is likely to be located
more than 45 kb from the deletion of patient 3214 and, therefore, more
than 100 kb from the deletion of patient 3824.
Example 7 

This example identifies three genes within the deleted region of
chromosome 5 in the two unrelated FAP patients characterized in 
Example 4. 

Genomic clones were used to screen cDNA libraries in three
separate experiments. One screening was done with a phage clone
derived from YAC 310D8 known to span the 260 kb deletion of patient
3214. A large-insert phage library was constructed from this YAC;
screening with Y11 identified λ205, which mapped within both dele-
tions. When clone λ205 was used to probe a random- plus oligo(dT)-
primed fetal brain cDNA library (approximately 300,000 phage), six
cDNA clones were isolated and each of them mapped entirely within
both deletions. Sequence analysis of these six clones formed a single
cDNA contig, but did not reveal an extended open reading frame. One
of the six cDNAs was used to isolate more cDNA clones, some of which
crossed the L5.71-proximal breakpoint of the 3824 deletion, as indi-
cated by hybridization to both chromosomes of this patient. These
clones also contained an open reading frame, indicating a transcrip-
tional orientation proximal to distal with respect to L5.71. This gene
was named DP1 (deleted in polyposis 1). This gene is identical to TB2
described above.

cDNA walks yielded a cDNA contig of 3.0-3.5 kb, and included
two clones containing terminal poly(A) sequences. This size corre-
sponds to the 3.5 kb band seen by Northern analysis. Sequencing of the
first 3163 bp of the cDNA contig revealed an open reading frame
extending from the first base to nucleotide 631, followed by a 2.5 kb 3'
untranslated region. The sequence surrounding the methionine codon
at base 77 conforms to the Kozak consensus of an initiation methionine
(Kozak, 1984). Failed attempts to walk farther, coupled with the simi-
larity of the lengths of isolated cDNA and mRNA, suggested that the
NH2-terminus of the DP1 protein had been reached. Hybridization to a
combination of genomic and YAC DNAs cut with various enzymes indi-
cated the genomic coverage of DP1 to be approximately 30 kb.

Two additional probes for the locus, YS-11 and YS-39, which had
been ascertained by screening of a cDNA library with an independent
YAC probe identified with MCC sequences adjacent to L5.71, were
mapped into the deletion region. YS-39 was shown to be a cDNA
identical in sequence to DP1. Partial characterization of YS-11 had shown
that 200 bp of DNA sequence at one end was identical to sequence cod-
ing for the 19 kd protein of the ribosomal signal recognition particle,
SRP19 (Lingelbach et al., supra). Hybridization experiments mapped
YS-11 within both deletions. The sequence of this clone, however, was
found to be complex. Although 454 bp of the 1032 bp sequence of
YS-11 were identical to the GenBank entry for the SRP19 gene,
another 578 bp appended 3' to the SRP19 sequence was found to consist
of previously unreported sequence containing no extended open reading
frames. This suggested that YS-11 was either a chimeric clone con-
taining two independent inserts or a clone of an incompletely processed
or aberrant message. If YS-11 were a conventional chimeric clone, the
independent segments would not be expected to map to the same physi-
cal region. The segments resulting from anomalous processing of a
continuous transcript, however, would map to a single chromosomal
region.

Inverse PCR with primers specific to the two ends of YS-11, the 
SRP19 end and the unidentified region, verified that both sequences 
map within the YAC 310D8; therefore, YS-11 is most likely a clone of 
an immature or anomalous mRNA species. Subsequently, both ends
were shown to lie within the deleted region of patient 3824, and YS-11
was used to screen for additional cDNA clones.

Of the 14 cDNA clones selected from the fetal brain library, one 
clone, V5, was of particular interest in that it contained an open read-
ing frame throughout, although it included only a short identity to the
first 78 5' bases of the YS-11 sequence. Following the 78 bp of identi- 
cal sequence, the two cDNA sequences diverged at an AG. Further- 
more, divergence from genomic sequence was also seen after these 78 
bp, suggesting the presence of a splice junction, and supporting the 
view that YS-11 represents an irregular message. 

Starting with V5, successive 5' and 3' walks were performed; the
resulting cDNA contig consisted of more than 100 clones, which
defined a new transcript, DP2. Clones walking in the 5' direction
crossed the 3824 deletion breakpoint farthest from L5.71; since its 3'
end is closer to this cosmid than its 5' end, the transcriptional orienta-
tion of DP2 is opposite to that of MCC and DP1.

The third screening approach relied on hybridization with a 120
kb MluI fragment from YAC 57B8. This fragment hybridizes with probe
Y11 and completely spans the 100 kb deletion in patient 3824. The
fragment was purified on two preparative PFGs, labeled, and used to
screen a fetal brain cDNA library. A number of cDNA clones previ-
ously identified in the development of the DP1 and DP2 contigs were
reascertained. However, 19 new cDNA clones mapped into the patient
3824 deletion. Analysis indicated that these 19 formed a new contig,
DP3, containing a large open reading frame.

A clone from the 5' end of this new cDNA contig hybridized to
the same EcoRI fragment as the 3' end of DP2. Subsequently, the DP2
and DP3 contigs were connected by a single 5' walking step from DP3,
to form the single contig DP2.5. The complete nucleotide sequence of
DP2.5 is shown in Figure 9.

The consensus cDNA sequence of DP2.5 suggests that the entire
coding sequence of DP2.5 has been obtained and is 8532 bp long. The
most 5' ATG codon occurs two codons from an in-frame stop and con-
forms to the Kozak initiation consensus (Kozak, Nucl. Acids Res.,
Vol. 12, pp. 857-872 (1984)). The 3' open reading frame breaks down over
the final 1.8 kb, giving multiple stops in all frames. A poly(A) sequence
was found in one clone approximately 1 kb into the 3' untranslated
region, associated with a polyadenylation signal 33 bp upstream (posi-
tion 9530). The open reading frame is almost identical to that identi-
fied as APC above.

An alternatively spliced exon at nucleotide 934 of the DP2.5
transcript is of potential interest. It was first discovered by noting
that two classes of cDNA had been isolated. The more abundant cDNA
class contains a 303 bp exon not included in the other. The presence in
vivo of the two transcripts was verified by an exon connection experi-
ment. Primers flanking the alternatively spliced exon were used to
amplify, by PCR, cDNA prepared from various adult tissues. Two PCR
products that differed in size by approximately 300 bases were ampli-
fied from all the tissues tested; the larger product was always more
abundant than the smaller.
Example 8 

This example demonstrates the primers used to identify subtle
mutations in DP1, SRP19, and DP2.5.

To obtain DNA sequence adjacent to the exons of the genes DP1,
DP2.5, and SRP19, sequencing substrate was obtained by inverse PCR
amplification of DNAs from two YACs, 310D8 and 183H12, that span
the deletions. Ligation at low concentration cyclized the restriction
enzyme-digested YAC DNAs. Oligonucleotides with sequencing tails,
designed in inverse orientation at intervals along the cDNAs, primed
PCR amplification from the cyclized templates. Comparison of these
DNA sequences with the cDNA sequences placed exon boundaries at
the divergence points. SRP19 and DP1 were each shown to have five
exons. DP2.5 consisted of 15 exons. The sequences of the
oligonucleotides synthesized to provide PCR amplification primers for
the exons of each of these genes are listed in Table III. With the excep-
tion of exons 1, 3, 4, 9, and 15 of DP2.5 (see below), the primer
sequences were located in intron sequences flanking the exons. The 5'
primer of exon 1 is complementary to the cDNA sequence, but extends
just into the 5' Kozak consensus sequence for the initiator methionine,
allowing a survey of the translated sequences. The 5' primer of exon 3
is actually in the 5' coding sequences of this exon, as three separate
intronic primers simply would not amplify. The 5' primer of exon 4 just
overlaps the 5' end of this exon, and we thus fail to survey the 19 most
5' bases of this exon. For exon 9, two overlapping primer sets were
used, such that each had one end within the exon. For exon 15, the
large 3' exon of DP2.5, overlapping primer pairs were placed along the
length of the exon; each pair amplified a product of 250-400 bases.
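
The idea of covering a long exon with overlapping amplicons can be sketched as a simple tiling calculation. The exon length used below is the roughly 6.5 kb figure given for this exon elsewhere in this document; the 400-base amplicon and 50-base overlap are arbitrary illustrative values, and the function `tile_exon` is hypothetical. The real primer positions are those of Table III, and the number of windows produced here is not meant to match them.

```python
def tile_exon(exon_length, amplicon, overlap):
    """Return (start, end) windows of fixed size that cover the exon,
    with consecutive windows sharing at least `overlap` bases."""
    if not 0 <= overlap < amplicon:
        raise ValueError("overlap must be smaller than the amplicon")
    if amplicon >= exon_length:
        return [(0, exon_length)]
    step = amplicon - overlap
    windows = []
    start = 0
    while start + amplicon < exon_length:
        windows.append((start, start + amplicon))
        start += step
    # Anchor the final window at the 3' end so it keeps the full size.
    windows.append((exon_length - amplicon, exon_length))
    return windows


windows = tile_exon(exon_length=6500, amplicon=400, overlap=50)
print(len(windows), windows[0], windows[-1])
```

Keeping every product within the 250-400 base range matters because that is the size range suited to the SSCP analysis used in the next example.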
Example 9

This example demonstrates the use of single stranded conforma-
tion polymorphism (SSCP) analysis as described by Orita et al., Proc.
Natl. Acad. Sci. U.S.A., Vol. 86, pp. 2766-70 (1989) and Genomics,
Vol. 5, pp. 874-879 (1989) as applied to DP1, SRP19 and DP2.5.

SSCP analysis identifies most single- or multiple-base changes in 
DNA fragments up to 400 bases in length. Sequence alterations are 
detected as shifts in electrophoretic mobility of single-stranded DNA 
on nondenaturing acrylamide gels; the two complementary strands of a 
DNA segment usually resolve as two SSCP conformers of distinct 
mobilities. However, if the sample is from an individual heterozygous 
for a base-pair variant within the amplified segment, often three or 
more bands are seen. In some cases, even the sample from a 
homozygous individual will show multiple bands. Base-pair-change 
variants are identified by differences in pattern among the DNAs of 
the sample set. 

Exons of the candidate genes were amplified by PCR from the
DNAs of 61 unrelated FAP patients and a control set of 12 normal indi-
viduals. The five exons from DP1 revealed no unique conformers in the
FAP patients, although common conformers were observed with exons
2 and 3 in some individuals of both affected and control sets, indicating
the presence of DNA sequence polymorphisms. Likewise, none of the
five exons of SRP19 revealed unique conformers in DNA from FAP
patients in the test panel.

Testing of exons 1 through 14 and primer sets A through K of
exon 15 of the DP2.5 gene, however, revealed variant conformers spe-
cific to FAP patients in exons 7, 8, 10, 11, and 15. These variants were
in the unrelated patients 3746, 3460, 3827, 3712, and 3751, respectively.
The PCR-SSCP procedure was repeated for each of these exons in the
five affected individuals and in an expanded set of 48 normal controls.
The variant bands were reproducible in the FAP patients but were not
observed in any of the control DNA samples. Additional variant
conformers in exons 11 and 15 of the DP2.5 gene were seen; however, each
of these was found in both the affected and control DNA sets. The five
sets of conformers unique to the FAP patients were sequenced to
determine the nucleotide changes responsible for their altered
mobilities. The normal conformers from the host individuals were sequenced
also. Bands were cut from the dried acrylamide gels, and the DNA was
eluted. PCR amplification of these DNAs provided template for
sequencing.

The sequences of the unique conformers from exons 7, 8, 10, and
11 of DP2.5 revealed dramatic mutations in the DP2.5 gene. The
sequence of the new mutation creating the exon 7 conformer in patient
3746 was shown to contain a deletion of two adjacent nucleotides, at
positions 730 and 731 in the cDNA sequence (Figure 7). The normal
sequence at this splice junction is CAGGGTCA (intronic sequence
underlined), with the intron-exon boundary between the two repetitions
of AG. The mutant allele in this patient has the sequence CAGGTCA.
Although this change is at the 5' splice site, comparison with known
consensus sequences of splice junctions would suggest that a functional
splice junction is maintained. If this new splice junction were func-
tional, the mutation would introduce a frameshift that creates a stop
codon 15 nucleotides downstream. If the new splice junction were not
functional, messenger processing would be significantly altered.

To confirm the 2-base deletion, the PCR product from FAP
patient 3746 and a control DNA were electrophoresed on an
acrylamide-urea denaturing gel, along with the products of a sequenc-
ing reaction. The sample from patient 3746 showed two bands differing
in size by 2 nucleotides, with the larger band identical in mobility to
the control sample; this result was independent confirmation that
patient 3746 is heterozygous for a 2 bp deletion.

The unique conformer found in exon 8 of patient 3460 was found
to carry a C-T transition, at position 904 in the cDNA sequence of
DP2.5 (shown in Figure 7), which replaced the normal sequence of CGA
with TGA. This point mutation, when read in frame, results in a stop
codon replacing the normal arginine codon. This single-base change
had occurred within the context of a CG dimer, a potential hot spot for
mutation (Barker et al., 1984).

The conformer unique to FAP patient 3827 in exon 10 was found 
to contain a deletion of one nucleotide (1367, 1368, or 1369) when com- 
pared to the normal sequence found in the other bands on the SSCP gel. 
This deletion, occurring within a set of three T's, changed the sequence
from CTTTCA to CTTCA; this 1 base frameshift creates a downstream
stop within 30 bases. The PCR product amplified from this patient's 
DNA also was electrophoresed on an acrylamide-urea denaturing gel, 
along with the PCR product from a control DNA and products from a 
sequencing reaction. The patient's PCR product showed two bands 
differing by 1 bp in length, with the larger identical in mobility to the 
PCR product from the normal DNA; this result confirmed the presence 
of a 1 bp deletion in patient 3827. 
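
The reason the deleted position is reported as 1367, 1368, or 1369 rather than as a single coordinate can be shown in a few lines: removing any one base from a run of identical bases gives the same product, so sequencing cannot tell which one was lost. The sketch below uses the six-base context quoted above; the helper name `delete_base` is an illustrative assumption.

```python
def delete_base(seq, index):
    """Return seq with the base at the given 0-based index removed."""
    return seq[:index] + seq[index + 1:]


normal = "CTTTCA"

# 0-based indices of the T's in the quoted context.
t_run = [i for i, base in enumerate(normal) if base == "T"]

# Delete each T in turn and collect the distinct results.
products = {delete_base(normal, i) for i in t_run}

print(t_run)
print(products)
```

All three deletions collapse to the single mutant sequence CTTCA, and because one base is not a multiple of three, the reading frame shifts downstream of the run.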

Sequence analysis of the variant conformer of exon 11 from 
patient 3712 revealed the substitution of a T by a G at position 1500, 
changing the normal tyrosine codon to a stop codon. 

The pair of conformers observed in exon 15 of the DP2.5 gene 
for FAP patient 3751 also was sequenced. These conformers were 
found to carry a nucleotide substitution of C to G at position 5253, the 
third base of a valine codon. No amino acid change resulted from this 
substitution, suggesting that this conformer reflects a genetically silent 
polymorphism.

The observation of distinct inactivating mutations in the DP2.5
gene in four unrelated patients strongly suggested that DP2.5 is the
gene involved in FAP. These mutations are summarized in Table IIA.
Example 10

This example demonstrates that the mutations identified in the 
DP2.5 (APC) gene segregate with the FAP phenotype. 

Patient 3746, described above as carrying an APC allele with a 
frameshift mutation, is an affected offspring of two normal parents. 
Colonoscopy revealed no polyps in either parent nor among the
patient's three siblings.

DNA samples from both parents, from the patient's wife, and
from their three children were examined. SSCP analysis of DNA from
both of the patient's parents displayed the normal pattern of conform-
ers for exon 7, as did DNA from the patient's wife and one of his off-
spring. The two other children, however, displayed the same new con-
former as their affected father. Testing of the patient and his parents
with highly polymorphic VNTR (variable number of tandem repeat) 
markers showed a 99.98% likelihood that they are his biological 
parents. 

These observations confirmed that this novel conformer, known 
to reflect a 2 bp deletion mutation in the DP2.5 gene, appeared sponta- 
neously with FAP in this pedigree and was transmitted to two of the 
children of the affected individual. 
Example 11 

This example demonstrates polymorphisms in the APC gene 
which appear to be unrelated to disease (FAP). 

Sequencing of variant conformers found among controls as well 
as individuals with APC has revealed the following polymorphisms in 
the APC gene: first, in exon 11, at position 1458, a substitution of T to 
C creating an RsaI restriction site but no amino acid change; and 
second, in exon 15, at positions 5037 and 5271, substitutions of A to G and 
G to T, respectively, neither resulting in amino acid substitutions. 
These nucleotide polymorphisms in the APC gene sequence may be 
useful for diagnostic purposes. 
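A polymorphism that creates a restriction site can be scored by digestion, since only the variant allele is cut. The following is a minimal sketch of detecting such a gained site in silico. RsaI recognizes GTAC; the flanking sequences `ref` and `alt` are invented for illustration and are not taken from exon 11, and the helper names are hypothetical.

```python
RSAI_SITE = "GTAC"  # RsaI recognition sequence


def find_sites(seq, site=RSAI_SITE):
    """Return the 0-based start position of every occurrence of `site`."""
    seq = seq.upper()
    width = len(site)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


def sites_gained(ref, alt, site=RSAI_SITE):
    """Positions where `alt` carries the site but `ref` does not."""
    return sorted(set(find_sites(alt, site)) - set(find_sites(ref, site)))


# Hypothetical sequence context: a single T-to-C change creates GTAC.
ref = "CCAGTATGGA"
alt = "CCAGTACGGA"
print(sites_gained(ref, alt))
```

With real amplicon sequences in place of the invented ones, the same comparison would indicate which allele an RsaI digest is expected to cut.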
Example 12 

This example shows the structure of the APC gene. 

The structure of the APC gene is schematically shown in 
Figure 8, with flanking intron sequences indicated. 

The continuity of the very large (6.5 kb), most 3' exon in DP2.5 
was shown in two ways. First, inverse PCR with primers spanning the 
entire length of this exon revealed no divergence of the cDNA 
sequence from the genomic sequence. Second, PCR amplification with 
converging primers placed at intervals along the exon generated 
products of the same size whether amplified from the originally isolated 
cDNA, cDNA from various tissues, or genomic template. Two forms of 
exon 9 were found in DP2.5: one is the complete exon; and the other, 
labeled exon 9A, is the result of a splice into the interior of the exon 
that deletes bases 934 to 1236 in the mRNA and removes 101 amino 
acids from the predicted protein (see Figure 7). 
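The figures given for exon 9A are internally consistent, as a short calculation shows: bases 934 through 1236 inclusive span 303 nucleotides, which is exactly 101 codons, so the splice removes whole codons and leaves the downstream reading frame intact. The helper name below is hypothetical.

```python
def deleted_segment(first_base, last_base):
    """Summarize an internal deletion given inclusive mRNA coordinates.

    Returns (bases removed, whole codons removed, frame preserved?).
    """
    n_bases = last_base - first_base + 1
    codons, remainder = divmod(n_bases, 3)
    return n_bases, codons, remainder == 0


# Exon 9A: bases 934 to 1236 of the mRNA are spliced out.
print(deleted_segment(934, 1236))
```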
Example 13 

This example demonstrates the mapping of the FAP deletions 
with respect to the APC exons. 

Somatic cell hybrids carrying the segregated chromosomes 5 
from the 100 kb (HHW1291) and 260 kb (HHW1155) deletion patients 
were used to determine the distribution of the APC gene's exons across 
the deletions. DNAs from these cell lines were used as template, along 
with genomic DNA from a normal control, for PCR-based amplification 
of the APC exons. 

PCR analysis of the hybrids from the 260 kb deletion of patient 
3214 showed that all but one (exon 1) of the APC exons are removed by 
this deletion. PCR analysis of the somatic cell hybrid HHW1291, 
carrying the chromosome 5 homolog with the 100 kb deletion from patient 
3824, revealed that exons 1 through 9 are present but exons 10 through 
15 are missing. This result placed the deletion breakpoint either 
between exons 9 and 10 or within exon 10. 
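The mapping logic used here can be sketched as follows: each exon is scored as amplified or not amplified from the hybrid, and a breakpoint is inferred wherever adjacent exons differ. The function name and the dictionary encoding are assumptions made for illustration. Exon-level PCR cannot distinguish a breakpoint in the intron from one inside the first missing exon, which is why the text leaves both possibilities open.

```python
def deletion_boundaries(pcr_result):
    """Return pairs of adjacent exons whose PCR status differs.

    `pcr_result` maps exon number -> True if a product was amplified
    from the hybrid carrying the deleted chromosome 5.
    """
    exons = sorted(pcr_result)
    return [(a, b) for a, b in zip(exons, exons[1:])
            if pcr_result[a] != pcr_result[b]]


# 100 kb deletion (hybrid HHW1291): exons 1-9 present, 10-15 missing.
hhw1291 = {exon: exon <= 9 for exon in range(1, 16)}
# 260 kb deletion: only exon 1 is retained.
del_260kb = {exon: exon == 1 for exon in range(1, 16)}

print(deletion_boundaries(hhw1291))
print(deletion_boundaries(del_260kb))
```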
Example 14 

This example demonstrates the expression of alternatively spliced 
APC messenger RNA in normal tissues and in cancer cell lines. 

Tissues that express the APC gene were identified by PCR 
amplification of cDNA made to mRNA with primers located within 
adjacent APC exons. In addition, PCR primers that flank the 
alternatively spliced exon 9 were chosen so that the expression pattern of 
both splice forms could be assessed. All tissue types tested (brain, lung, 
aorta, spleen, heart, kidney, liver, stomach, placenta, and colonic 
mucosa) and cultured cell lines (lymphoblasts, HL60, and 
choriocarcinoma) expressed both splice forms of the APC gene. We 
note, however, that expression by lymphocytes normally residing in 
some tissues, including colon, prevents unequivocal assessment of 
expression. The large mRNA, containing the complete exon 9 rather 
than only exon 9A, appears to be the more abundant message. 

Northern analysis of poly(A)-selected RNA from lymphoblasts 
revealed a single band of approximately 10 kb, consistent with the size 
of the sequenced cDNA. 
Example 15 

This example discusses structural features of the APC protein 
predicted from the sequence. 

The cDNA consensus sequence of APC predicts that the longer, 
more abundant form of the message codes for a 2842 or 2844 amino 
acid peptide with a mass of 311.8 kd. This predicted APC peptide was 
compared with the current data bases of protein and DNA sequences 
using both Intelligenetics and GCG software packages. No genes with a 
high degree of amino acid sequence similarity were found. Although 
many short (approximately 20 amino acid) regions of sequence 
similarity were uncovered, none was sufficiently strong to reveal which, if 
any, might represent functional homology. Interestingly, multiple 
similarities to myosins and keratins did appear. The APC gene also was 
scanned for sequence motifs of known function; although multiple 
glycosylation, phosphorylation, and myristoylation sites were seen, 
their significance is uncertain. 
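The predicted peptide length and mass can be sanity-checked with simple arithmetic. The sketch below takes the coding region coordinates from the sequence listing (CDS at 34..8562 of SEQ ID NO:1) and applies an average residue mass of about 110 daltons. That value is a common rule of thumb adopted here as an assumption; it is not the method used to obtain the 311.8 kd figure, and the function names are hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average, an assumption


def cds_codons(start, end):
    """Number of codons in a coding region with inclusive coordinates."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("coding region is not a whole number of codons")
    return length // 3


def approx_mass_kd(n_residues, avg=AVG_RESIDUE_MASS_DA):
    """Rough polypeptide mass in kilodaltons."""
    return n_residues * avg / 1000.0


n = cds_codons(34, 8562)
print(n, approx_mass_kd(n))
```

The codon count agrees with the 2843 residues listed for SEQ ID NO:2, and the rough mass lands within about one kilodalton of the reported value.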

Analysis of the APC peptide sequence did identify features 
important in considering potential protein structure. Hydropathy plots 
(Kyte and Doolittle, J. Mol. Biol., Vol. 157, pp. 105-132 (1982)) indicate 
that the APC protein is notably hydrophilic. No hydrophobic domains 
suggesting a signal peptide or a membrane-spanning domain were 
found. Analysis of the first 1000 residues indicates that α-helical rods 
may form (Cohen and Parry, Trends Biochem. Sci., Vol. 11, pp. 245-248 
(1986)); there is a scarcity of proline residues, and there are a number of 
regions containing heptad repeats (apolar-X-X-apolar-X-X-X). 
Interestingly, in exon 9A, the deleted form of exon 9, two heptad repeat 
regions are reconnected in the proper heptad repeat frame, deleting 
the intervening peptide region. After the first 1000 residues, the high 
proline content of the remainder of the peptide suggests a compact 
rather than a rod-like structure. 
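A hydropathy plot of the kind cited above is a sliding-window average of per-residue hydropathy values. The sketch below uses the published Kyte and Doolittle scale. The window width of 9 is an arbitrary choice made for illustration, the function name is hypothetical, and the test sequence is the first 64 residues of SEQ ID NO:2 as read from the sequence listing.

```python
# Kyte-Doolittle hydropathy index (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Mean hydropathy of each full window along a one-letter sequence."""
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    scores = [KD[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


# Residues 1-64 of SEQ ID NO:2 in one-letter code.
N_TERM = ("MAAASYDQLLKQVEAL" "KMENSNLRQELEDNSN"
          "HLTKLETEASNMKEVL" "KQLQGSIEDEAMASSG")

profile = hydropathy_profile(N_TERM, window=9)
mean_all = sum(KD[aa] for aa in N_TERM) / len(N_TERM)
print(f"windows: {len(profile)}, overall mean: {mean_all:.2f}")
```

A negative overall mean for this segment is in line with the statement that the protein is notably hydrophilic; a membrane-spanning segment would instead show a sustained run of strongly positive window values.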

The most prominent feature of the second 1000 residues is a 20 
amino acid repeat that is iterated seven times with semiregular spacing 
(Table 4). The intervening sequences between the seven repeat regions 
contained 114, 116, 151, 205, 107, and 58 amino acids, respectively. 
Finally, residues 2200-2400 contain a 200 amino acid basic domain. 

(1) GENERAL INFORMATION: 

(i) APPLICANT: ALBERTSEN, HANS 
CARLSON, MARY 
GRODEN, JOANNA 
HEDGE, PHILIP J. 
JOSLYN, GEOFF 
KINZLER, KENNETH 
MARKHAM, ALEXANDER F. 
NAKAMURA, YUSUKE 
THLIVERIS, ANDREW 

(iii) NUMBER OF SEQUENCES: 94 

(iv) CORRESPONDENCE ADDRESS: 
(A) ADDRESSEE: Banner, Birch, McKie & Beckett 
(B) STREET: 1001 G Street, NW 
(C) CITY: Washington 
(D) STATE: D.C. 
(E) COUNTRY: USA 
(F) ZIP: 20001-4597 

(v) COMPUTER READABLE FORM: 
(A) MEDIUM TYPE: Floppy 
(B) COMPUTER: IBM PC compatible 
(C) OPERATING SYSTEM: PC-DOS/MS-DOS 
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 

(vi) CURRENT APPLICATION DATA: 
(A) APPLICATION NUMBER: US 07/741,940 
(B) FILING DATE: 08-AUG-1991 
(C) CLASSIFICATION: 



(viii) ATTORNEY/AGENT INFORMATION: 
(A) NAME: Kagan, Sarah A. 
(B) REGISTRATION NUMBER: 32,141 
(C) REFERENCE/DOCKET NUMBER: 110- 

(ix) TELECOMMUNICATION INFORMATION: 
(A) TELEPHONE: 202-508-9100 
(B) TELEFAX: 202-506-9295 

(2) INFORMATION FOR SEQ ID NO:1: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 9606 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Homo sapiens 

(vii) IMMEDIATE SOURCE: 
(B) CLONE: DP2.5(APC) 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 34..8562 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 
OOKCTCCOAA ATCACCTCCA ACCCTACCCA AOO ATG CCT CCA OCT TCA TAT CAT 

1 5 

assssssssssassss 

10 15 

r»f xit TCC AAT CAT CTT ACA AAA CTC CAA ACT 

S IK K S ill «. u. »; - «■ «» 

25 30 

_ - T * rr^ AAA CAA CTA CAA CCA ACT ATT 

S S E2SSSS2S «; - «■ »>■ •« <>; 

40 4S 

_ .„ c ~ TCT TC T CCA CAC ATT CAT TTA TTA CAC CCT 

SS S IS S S S S E .iy «; " ? ,.u oi« xr 5 

60 

—ft r»r *ee ACT AAT TTC CCT CCA CTA AAA CTC 

S »■ ft S - 2 S £ SS £ » » *g - 

75 60 

105 11 

^ — - r«ri xxr CCA ACC ACA CAA AST ACT CCA TAT TTA CAA 

S s s s s » s s. « «- a «- •» ** - a 

120 125 

/•it <*tt eifi XXX CAC ACC TCA TTG CTT CTT CCT CAT CTT CAC AAA CAA 
!K S3 cu ^ S £ £ x- t« ft AU A. P L.« A. P Ly. dl. 

140 

170 175 

1S5 



54 



102 



150 



198 



246 



294 



342 



390 



438 



4S6 



S34 



582 



630 
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ATC CAA 
Met Clu 
200 

CGA AGA 
Arg AT9 



205 



ATA CCC 
lie XI* 



ews C XA ATC CAA AAC CAC ATA CTT CCT ATA 
££ JS S J" ;l- L,. A.p lie jg IX. 

220 22S> 



CCA CAC 
Arg cm 



AAC AAC 
Aen Lye 



CAA CCA 
Cln Cly 
265 

TCA ACT 

ser Thr 

280 

AGC ACA 
Sir Thr 



CTT TTA 
Leu Leu 
235 

CAT CAA 
His clu 
250 

GTC CCA 
ve>l Cly 



^ mrr r*Jk CCA ACA CAA CCA CAC AGC TCA TCT CAC 
§E E cS AU £ Clu AU Clu Ary Ser S.r Cln 
240 *" 

ACC CCC TCA CAT OA? CCT CAC CCC CAC AAT CAA CGI 
Sy Ser Hi. A.p AU eiu Arg Ola am Clu Ciy 

255 260 

CAA ATC AAC ATC CCA ACT TCT CCT AAT CCT CAC CCT 
32 55 En Alt Tht Ser Cly A.n Cly Cln Cly 

270 275 

mM «5A ATC CAC CAT CAA ACA CCC ACT CTT TTC ACT TCT ACT 
SS S5. 2j St. Clu Thr Al. Ser ».l Leu Ser S.r Mr 

M , frT ^ ret CCA ACC CTC ACA ACT CAT CTC CCA ACC AAC 
SS «S £ £ 3 2, Leu Thr S.r Hi. Leu Cly Tjr Ly. 

BSSSS2SS 1 « S !5 S S 25 S 

fcrT rrG cTA CCT ATC TCT ACC TCC CAA CAC ACC 

S 12 S £ S 2 5 H.t S.r Ser S.r Cln A.p ser 

330 - 35 



CAT CAT 
A.p Afp 



TCT ATA 
Cyt lie 
345 

TTA CAT 
Leu Hie 
360 

ACT AAA 
Ser Lye 



CAC TCA 

Hie Ser 



CAT CTT 
Hie Leu 



CAC CAA 
Cln Clu 
42S 



±«r <r a ri- TCT CCA TCT CTT CCT CTC CTC ATC CAC CTT 

Sr m£ 2g S. S 0» Cy. L.u »«» U. Leu II. Cln L.u 
350 358 

eee aat CAC AAA OAC TCT CTA TTC TTC CCA AAT TCC CCC CCC 

I™ S2 2p £ S| S.r v.l Leu Leu Cly A.n S.r Arg Cly 

365 

rr-r mis cefi ACC CCC ACT CCA CCA CTC CAC AAC ATC ATT 

SI All S S 5 S Kr Alt Alt L.u Hi. A.n 11. II. 

380 385 

395 *00 

t-c CAA CAC ATA CCC CCT TAC TCT CAA ACC TCT TCC CAC TCC 

F.t SS £n lie TA AU Tyr Cy. Clu Tnr Cy; Trp Clo Trp 
410 4 * S 

CCT CAT CAA CCA CCC ATC CAC CAC CAC AAA AAT CCA ATG CCA 

t" Hi. Oil P« Sly Met A.p Cln A.p Ly. A.n Pro Met Pre 
430 435 



678 



726 



774 



622 



670 



9ie 



966 



1014 



1062 



1110 



1158 



1206 



1254 



1302 



1350 
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_ ^ M . T CAA rXT CAC ATC TCT CCT CCT CTC TCT CTT CTA ATC AAA 
CCT CCT CTT CAA CAT CAC ATC tv ^ ^ ^ 

XX* Fro v.l Clu Hi.. Cln lit cy. r ^ ^ 

440 445 

... r^fi CAT AC A CAT CCA ATC AAT CAJL CTA CCC CCA 

s ss s s k ss s: £ s" » ; - •» «- »» $ 

. .~ TTA TTC CAA CTC CAC TCT CAA ATC TAT CCC 

Si S! E JS S SK 21 SS SJ V.l A. ? Cy. Clu jet Tyr Cly 
475 4.0 

CTT ACT WIT CAC CAC TAC ACT AIT AC* CTA ACA CCA TAT CCT CCA ATC 
S t2 En A.p Hi. Tyr S.r II. Thr Uu Ar, Ar 9 Tyr Al. Cly Met 



635 



ACA AAT CAC CAC CAC ACC CAA ATC CTA ACA CAC AAC AAC TCT CTA CAA 
Thr Ain Clu A.p Rii Arg cm XI. L.u Ar, Clu A.n A.« cy. L.« cln 
650 *** 60 

ACT TTA TTA CAA CAC TTA AAA TCT CAT ACT TTC ACA ATA CTC ACT AAT 
?S Su Uu Cin Hi. Uu Ly. S.r Hi. S.r to. Thr II. v*l S.r A.n 

$65 * 10 * 75 



1398 



1446 



1494 



1542 



1590 



err TTC ACA AAC TTC ACT TTT CCA CAT CTA CCC AAC AAC CCT ACC CTA 
AU itr A.n Su Thr Fhe Cly A.p V.l Ala A.n Ly. Ml Thr Leu 
505 510 5iS 

TCC TCT ATC AAA CCC TGC ATC ACA CCA CTT CTC CCC CAA CTA AAA TCT 1638 
III UTr £Z £ Sly Cy. K.t Arg Al. Leu V.l XI. Cln Leu Ly. Ser 
520 52* 530 " 

fi]UL ACT CXA CAC TTA CAC CAC CTT ATT CCA ACT CTT TTC ACC AAT TTC 1686 
c" s" c£ A.p Leu Cln Gin V.l lie Al. S.r V.l Leu Arg A.n Leu 
540 MS 880 

TCT TCC CCA CCA CAT CTA AAT ACT AAA AAC ACC TTC CCA CAA CTT CCA 1734 
5? Trp Arg Al* A.p VaX A« S«r Ly. Ly. Thr L«u Arg Clu V.l Cly 
555 560 5*5 



ACT CTC AAA CCA TTC ATC CAA TCT CCT TTA CAA CTT AAA AAC CAA TCA 1782 
a" vS Ly. AU £*u M.t Clu Cy. AU L.u Clu V.l Ly. Ly. Clu S.r 
570 575 »•*» 



ACC CTC AAA ACC CTA TTC ACT CCC TTA TCC AAT TTC TCA CCA CAT TCC 1830 
Th5 2u ly. Ser vii L.u S.r Al. L.u Trp A.n L«u S.r AU HI. Cy. 

Sas 590 595 

ACT CAC AAT AAA CCT CAT ATA TCT CCT CTA CAT CCT CCA CTT CCA TTT 1878 
iS 85 A-n Ly. Al. A.p II. cy. AU V.l A.p Oly Al. L.u Al. Ph. 

600 605 **o •*» 

TTC CTT CCC ACT CTT ACT TAC CCC ACC CAC ACA AAC ACT TTA CCC ATT 1926 
Leu V.l Cly Thr Leu Thr Tyr Xrg Ser Cln Thr A.n Thr Leu AU lie 
620 625 630 

ATT CAA ACT CGA CCT CCC ATA TTA CCC AAT CTC TCC ACC TTC ATA CCT 19?4 
lie Clu Ser Cly Cly Cly lie Leu Arg A.n V*l Ser Ser Leu lie Al. 
640 *45 
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~r «aa AAT CTC TCA CCA AC A AAT CCT AAA CAC CAC 

S 3 ffi S S «p £ Si - «• - " p S! 

(80 685 

• rrr m CTT ACC ATO CTC AAC AAC CTC ATT 
CAA CCA TTA TOO CAC ATC CCG CCA CTT ACC ^ ^ ^ n# 

ciu Al4 Leu trp Atp Met oiy ai* • 710 

700 

SSKSSSSSSSiSSSiSSS: 

730 735 

~. w CCA TCT CTT CAT CTT ACC AAA CAA AAA CCC 

£ S5 Sy C £ E S S £ 2 Hi. nl Ar f Ly. cm ty. Al. 

745 750 

s gi s k s S s s s 2 s aj b s s k 

760 765 

S5BSSI55SS25SSSSS 

795 »ww 

ss s s is s s s g s s ss s k s k 

*i* m AAT ACT ACA CTC TTA CCC ACC TCC TCT TCA TCA ACA MA 
Pro Tyr Su A*n Thr Thr vil Le« Pre S.r Ser S.r S.r Ser At 9 01, 
825 830 

52 Si SS5 £ S S Si » S g S S SK S ^ 

_„ e-e XAC TAC CAT CCA CCA ACA CAA AAT CCA CCA 
S SJ S S S? £ £ ffi >» AU Thr .1. A.n Pro Cly 

87S "° 

890 8,5 

— «m »rr CJUL TTA CAT TCT CTC ACA CAT GAG AGA 2790 

S C SSS S £ c£ S S *- VJX Thr A.p CXu A,, 



2166 
2214 
2262 
2310 
2358 
2406 
2454 
2502 
2550 
2598 
2646 
2694 
2742 



905 «° 
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m _ kCC TCT CCT CCC CAT ACA CAT TCA AAC ACT TAC 

js a s s s £ £ s - ». ». «. »« «- ~ m 

920 925 

pi a iat TCA AAT AGO ACA TCT TCT ATC CCT TAT 

£ £ £ K |« S S S; ffi ~ *• «< «« ;» ** 
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55 £ S S5 Su S Hi. Ly. XX. Hi. A.„ Hi. M.C A.p 



103 S 



1130 



xbt CAA CCT TAC TCT CAA CAA CAA CAC CAT CAA CAA CAA CAC ACA CCA 
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5 Alp til S5 S M. S«r Cly jggCl. 8.r Pro S.r Cl^A.n clu 
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w cfi ACA CCC AAA CAC ATA ATA CAA CAT CAA ATA AAA CAA ACT 
S SJ S5 S £ S. XUI1. Clu A.p Clu U. o Ly. Cin S.r 

... »g» cAA TCA ACC AAT CAA ACT ACA ACT TAT CCT CTT TAT ACT 3170 

SI SS Eg* SI £ £ En cin S.r Thr Thr Tyr Pro V*i Tyr Thr 

1065 1° 70 1 

cac ACC ACT CAT CAT AAA CAC CTC AAC TTC CAA CCA CAT TTT CCA CAC 
SS tlr £ Sp £ ty. Ri. U- 1*. >~ Hi. tte Cly Jin 

X080 10$S 40»w 

CAC CAA TCT CTT TCT CCA TAC ACC TCA CCC CCA CCC AAT CCT TCA CAA 
cS SI vll s.r^ro Tyr Arg s.r Ar^cly Al. A.n Cly s.rClu 

ACA AAT CCA CTC CCT TCT AAT CAT CCA ATT AAT CAA AAT CTA ACC CAC 3414 
£ En tl 9 vll Cly S.r A.n Hi. Cly II. A.n Cin A.n V*l S.r Cin 

1115 1120 

TCT TTC TCT CAA CAA CAT CAC TAT CAA CAT CAT AAC CCT ACC AAT TAT 3462 
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&i& <r*T XXT CAA GAG AAA CCT CAT CTC CAT CAG 

S £ Si S S y S SB 

1160 1165 

- ... TXT CCC ACA CAT ATT CCT TCA TCA CAG 

CCT ATT CAT TAT ACT TTA AAA TAT CCC ACA ^ ^ ^ CXn 

rro He A« P Tyr J« 0 L * U tjri Tyr Ai nes "*> 

£ S »« SS B S S SB IH S B S S B 
BBEBSSSSBBSBSSBE 

1210 

«~ /*»r rrr CAT CCA ACT TCT CCA CAG ACT AGA 

£ S £ S K £ 2 S J " iS.*- el " Mt 

1225 1230 
ACT CCT CAG CCT CAA AAC TCT CCC ACT TGC AAA CTT TCT TCT ATT AAC 
S.r Sly Oln Pro Cln Lyi Ala Alt iw cy« 7 J2JS 

1240 134S 

°* a " s* ss s & s: "J a s s £ £ s 

Cln clu Thr lit cln Th. Tyr c>i v*± w ? iJ70 
1260 

~* »m rn TTA TCA TCT TTG TCA TCA GCT CAA CAT CAA ATA 

2 i£ SJ S5 S S S 2* s« .x. ciu a «. xu 
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rr- XXT CAG ACC ACA CM CAA CCA CAT TCT CCT AAT ACC CTC CAA 

SI Sn Sn & tS Cln «- M. a-P }«»r I- Cln 

1290 12,5 

BBBBBBB.SSJSBBSB5B 

1305 1310 
_ «, r.TT CCA CCA CTC TCA CAO CAC CCT ACA ACC AAA TCC ACC 

S3 S« Si tVl £ K S «cr CI. Hi. ,re o Ar, Thr Ly. Ser 



1320 132S 

B S? S 5 SS BBS SS S £ S &S 
53 B B BBS SSSSSSS §E SS 

— . • CCT 6AA CAC TAT CTT CAG GAS ACC CCA CTC ATC 

S Eo £ 52 £ S £u Hi. Tyr V*l Cln Clu WW 1- *t 
1370 1375 
rT B __ .„ TCT ACT TCT CTC ACT TCA CTT CAT ACT TTT CAC ACT CCT 
55 S« £ gl £ SI v.l S.r S.r u- A.p S.r Ph. Clu S.r Ar, 

13flS I 390 
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3796 

3846. 
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3942 

3990 

4038 
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4182 
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— ~ m rxR ACT CAA CCA TCC ACT CCA ATC CTA ACT 4276 

is tn ?s s s as s as - & «» « « i & 

1400 1405 



— »~*> rrr CCA CAT ACC CCT CCA CAA ACC ATC 

{H 53 S 85 5 S S j :s ~ «, «. ~ - 

— rrsasisssssasss 

pro Pro S.r Arg S.r Ly. Thr Fro rr» 14<5 
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« M rei cat CCT CAT ACT TTA TTA CAT TTT CCC ACA CAA ACT 

SK E % Si g S S m * «. hi; o ^ •»» ;« 

1480 * 485 

act eCA CAT COA TTT TCT TCT TCA TCC ACC CTC ACT CCT CTC ACC CTC 
£ $To S S f « cy. st S.r S.rUu S.r AU U« Se^Uu 

ex* CXC CCA TTT ATA CAC AAA CAT CTC CAA TTA ACA ATA ATC CCT CCA 
Si tit Pro ™. 5S £n Ly A.P Vjl Clu Uu Ax 9 II. «t Pro Pro 
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r»c ei& XXT CAC AAT CCC AAT CAA ACA CAA TCA CAC CAC CCT AAA 
v" S» SS £ g !S £y j.» 8 .l. tte -lu S.r almoin Pro Ly. 
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4374 



442? 



4470 



4SI8 



4566 



4614 



4662 



rii ^ cxx xxc cAA CAC AAA CAC CCA CAA AAA ACT ATT CAT TCT 471C 

8H IS £S ciu Ly. Clu AU Clu ly. Thr II. X.p S.r 

1545 XS50 «« 



cir ctx TTA CAT OAT TCA CAT CAT CAT CAT ATT CAA ATA CTA 4756 

iu k £ 2S S *; g s« mp ».p m,m ? xi. .i- ix« g. 

1S60 1**5 * 

cax CXA TOT ATT ATT TCT CCC ATC CCA AC* AAC TCA TCA CCT AAA CSC 
CU Si OR iS S S.r AU Mt Pro ThrLy. S.r S.r Ar, ty. 01, 
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1S80 »85 15,0 

AAA AAC CCA CCC CAS ACT SCT TCA AAA TTA CCT CCA CCT CTC CCA ACC 48S4 
£ l?o Al. s Cln Thr Al* S.r jgU. Pro Pro Pro jgw. Ar, 

... «... . BT CTS CCT CTS TAC AAA CTT CTA CCA TCA CAA AAC ACS 4902 

"o jS 3! 1" v.l Tyr^Ly. Uu Uu Pro Jt^Oln Aon Arc 

TTC CAA CCC CAA AAS CAT STT ACT TTT ACA CCC CSC SAT CAT ATS CCA 4950 
ES Si 21 V.l Ser Ph. Thr Pro My Mp X.p Met Pro 

1625 163C *«• 
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rr - xrx ccT ATA AAC TTT TCC ACA OCT ACA 

5SSSSffi,SJSS.u a 

1640 1645 

— . »rr r.xx ICC CCT CCA AAT GAG TTA CCT CCT 

TCT CTA AST CAT CTA ACA ATC CAA T« CCT ^ ^ AU ^ 

Ser U« «•* *«P }JJj 0 Tbr 11 1665 "'0 

S Si S S S S SJ S 22 5 K SI K » 2! 

5 s sss ss ss 3,= * s = ~ ~ 
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KS5 

5046 



S|S2SS SS KSSSfSSSK 
S S S S S SS 5 J2 !2 EE SSS & 

1720 1725 * 

l»i %rr cxe AAG CCT TTC CCT OTC AAA AAC ATA ATC CAC CAC CTC CAC 
£ £5r £S S! at, v.1 ty. g. n. M.t A. P Gin moin 

~~r rfffi TCC TCT TCT CCA CCC AAC AAA AAT CAC TTA GAT CCT 
SSSS SS I« JS Ju P« A.n ty. A.„ Cln t.u A. P Oly 
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5142 



5190 



523B 



5286 



5334 
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5430 



»ir in XXC AAA CCA ACT TCA CCA CTA AAA CCT ATA CCA CAA AAT ACT 
i£ £ S Shr s.r » v.1 Ly. Pro U. J«C1* A.n Thr 

CJlX TAT AGC ACA CCT CTA ACA AAA AAT CCA CAC TCA AAA AAT AAT TTA 

SK Sr S S 5 V.1 A.n AU. A.p S.r Lys AH Mb U. 

j79S 1790 17,s 

AAT CCT CAC ACA CTT TTC TCA CAC AAC AAA CAT TCA AA5 AAA CAC AAT 4478 
SI SI SS v.1 Phe S.r A. P A.n Ly. A.p S.r ty. ty. Cln j.» 
KOO 1805 *oav 

•m im AAT AAT TCC AAC CAC TTC AAT CAT AAC CTC CCA AAT AAT CAA 5526 
^^Mnlul 5«rSS A.p Phe A.n A.p*. U« Pro A.n A.nSiu 



cat ASA flTC ASA CCA AC ' TTT CCT TTT CAT TCA CCT CAT CAT TAC ACC IS*4 

™ !S £ S «- j-** ' re Hi> Ihr 

CCT ATT SAA CCA ACT CCT TAC TCT TTT TCA COA AAT GAT TCT TTC AST 5622 
S c£ tlr Tto Pre Tyr cy. Ph. S« A.-, A.n A.p s.r u. s.r 
1850 185S i» eo 

TCT CTA SAT TTT SAT SAT CAT SAT CTT SAC CTT TCC ABC CAA AAC 6CT S670 
I" tnl lip A.p A.p A.p V.1 A.p t.„ S.r Ar, 01« ty. Al. 
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CAA TTA ACA AAC CCA AAA CAA AAT AAC CAA TCA CAC CCT AAA CTT ACC 
SKZ2 Arg L?t AXt Lye^Glu Aen Lye Clu Ser^Clu AU Lye Vel Th^ 

ACC CAC ACA CAA CTA ACC TCC AAC CAA CAA TCA CCT AAT AAC ACA CAA 
ser Hit Thr Clu Leu Thr Ser Aen Cln Cln Ser AU Aen Lye Thr Cln 
1900 1905 1910 

CCT ATT CCA AAC CAC CCA ATA AAT CCA CCT CAC CCT AAA CCC ATA CTT 
AU lie AU Lye Cln Pro lit Asn Arg Ciy Cln Pro Lye Pro He Leu 
X91S 1920 1925 

CAC AAA CAA TCC ACT TTT CCC CAC TCA TCC AAA CAC ATA CCA CAC ACA 
Cln Lye Cln Ser Thr Phe Pro Cln Ser Ser Lye Aep He Pro Aep Arg 
1930 1935 1940 

CCC CCA CCA ACT GAT CAA AAC TTA CAC AAT TTT CCT ATT CAA AAT ACT 
civ Ale Ale Thr Aep Clu Lye Leu Gin Aen Phe Ale lie Clu Aen Thr 
7 1945 19$0 1955 

CCA CTT TCC TTT TCT CAT AAT TCC TCT CTC ACT TCT CTC ACT CAC ATT 
Pro Vel Cye Phe Ser Hie Aen Ser Ser Leu Ser Ser Leu Ser Aep He 

I960 . 1965 1970 1975 

CAC CAA CAA AAC AAC AAT AAA CAA AAT CAA CCT ATC AAA CAC ACT CAC 
Aep Cln Clu Aen Aen Aen Lye Clu Aen Clu Pre He Lye Clu Thr Clu 
1980 1985 1990 
CCC CCT CAC TCA CAC CCA CAA CCA ACT AAA CCT CAA CCA TCA CCC TAT 6054 
Pro Pro Aep Ser Cln Cly Clu Pro Ser Lye Pro Cln AU Ser Cly Tyr 
1995 2000 2005 

CCT CCT AAA TCA TTT CAT CTT CAA CAT ACC CCA CTT TCT TTC TCA ACA 6102 
Ale Pro Lye Ser Phe Hie Vtl Clu Aep Thr Pro Vel Cyi Phe Ser Arg 

2010 201S 2020 

AAC ACT TCT CTC ACT TCT CTT ACT ATT CAC TCT CAA CAT CAC CTC TTC 6150 
Aen Ser Ser Leu Ser Ser Leu Ser He Aep Ser Clu Asp Aep Leu Leu 

2025 2030 203S 

CAC GAA TCT ATA ACC TCC CCA ATC CCA AAA AAC AAA AAC CCT TCA ACA 6198 
cln Clu eye He Ser Ser Ala Met Pro Lye Lyt Lye Lys Pro Ser Arg 
2040 2045 2050 2055 

CTC AAC CCT GAT AAT CAA AAA CAT ACT CCC ACA AAT ATS CCT CCC ATA 6246 
Leu Lye Cly Aep Aen Clu Lys Kie Ser Pro Arg Atn Ket Cly Cly He 
2060 2065 2070 



TTA CCT GAA GAT CTC ACA CTT CAT TTC AAA CAT A*"A CAC AGA CCA CAT 6294 
Leu Cly Clu Aep Leu Thr Leu Aep Leu Lyt Aep I t Gin Arg Pro Aep 
2075 2080 2065 

TCA CAA CAT CCT CTA TCC CCT CAT TCA CAA AAT TTT GAT TCC AAA CCT 6342 
Ser Clu Hit Cly L«u Ser Pro Aep Ser Clu Aen Phe Aep Trp Lye Ale 
2090 2095 2100 

ATT CAC CAA CCT CCA AAT TCC ATA CTA ACT ACT TTA CAT CAA GCT CCT 6390 
He Cln Glu Gly Ale Aen Ser He v&l Ser Ser Leu Hie Cln AU Ale 
2105 2110 2115 
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GCT TCT CCA AAA ATC TCA TAT ACA TCT «* ^ c)ft 

cxy s.r GXy Ly. M-t s.r Tyr Thr s.r rr ^ ^ 
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"y S.r |Ss«r S« c Xle L * U g* 5 KU S-r *" ClU 'Ho * 
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tu i£ sS OlS eft Ly. Hi. VI A.n S.r IX. S.r CXy Thr ly. 
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{il $ S En £ V.X U. Uu «.r Ar 9 H.t S.r S.r Thr Ly. 

2430 2415 *** W 

^ XCT C xA TCT CAT ACA TCA CAA ACA CCT CTA TTA CTA CCC 

S £ !S K cS IS A.pArg S.r Cl„ Ar, Pro V.X Uu V.! Ar, 

2425 2430 ««« 

«r ACT TM ATC AAA CAA CCT CCA ACC CCA ACC TTA AGA ACA AAA 7398 
%t IS & 25 S & OX- AX* fro S.r Pro Thr Uu Ar, Ar, ly. 

2440 244$ *« aa 

we CAC CAA TCT CCT TCA TTT CAA TCT CTT TCT CCA TCX TCT ACA CCA 

Su oil tu IS SI S Ph. Glu s.r Uu S.r Fro S.r Ser Ar, fro 
2440 8465 

CCT TCT CCC ACT ACC TCC CAC CCA CAA ACT CCA CTT TTA ACT CCT TCC 7494 
SI III P« JS ire S.r CXn AX. CXn Thr Pre v.: Leu S.r Pro S.r 
247S 2480 ««« 

CTT CCT CAT ATC TCT CTA TCC ACA CAT TCC TCT CTT CAC CCT CCT CCA 7S42 
SI S A.p Sit IS Uu S.r Thr Hi. S.r s.r v.l Gin Al. Cly CXy 
2490 249S «»W 

TCC CCA AAA CTC CCA CCT AAT CTC ACT CCC ACT ATA CAC TAT AAT CAT 7S90 
Trp Ar, Ly. Uu Pro Pro A.n Uu S.r Pro Thr II. CXu Tyr Am A.p 
2505 25X0 23X5 

CCA ACA CCA CCA AAC CCC CAT GAT ATT CCA CCC TCT CAT TCT CAA ACT 763S 
S} SS AU 5. Ar, Hi. A.p IX. AX. Ar, S.r Hi. S.r Clu S.r 
2520 2S2$ 2530 2535 

CCT TCT ACA CTT CCA ATC AAT ACS TCA CCA ACC TCC AAA CCT CAC CAC 7686 

Pro I" 
2540 
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S« SI IS I« S.r to. Pro Arg V.X S.r Thr Trp Ar, Ar, Thr 
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r-AA CTA TCC CCA AAA CCA ACA TCC ACA AAA ATA 

as s £ ™ K - JS.~ "» w " 5SI. 

2600 2605 
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an ?s s a: s 2 - « - &~ s " «* tM 

_~ CCT CAA TCA AAG ACT CTA ATT TAT CAA ATC 

E til SI iS S« £ 52 Si S. ty. «- a. n; «n 

2635 *©«w 
— ^— *r-r in ACA GAG OAT GIT TCC CTC ACA ATT CAG CAC 

S S S SS £ £ £ £ up vix Tr P v.i jgn. ex. x. P 
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£ S ffi £ 85 £ £*r A.n «,«.r g. XI. v.l CI- xr, 
2745 2750 *' 9S 
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£ £ £ £ £ S«S.r S.r s.r Ly. JU S « S.r Pro Ser CXy $ 
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2665 2670 
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SS S S 5 5S I., ci. ju« «• & 
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£ £ £ £ £ 1U Ly. Cl„ A.n v.l Cly A.n Cly S.rm 
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8214 



~ /-w tvrr TTC CAA AAT CCC CTC ACC TCC TTT ATT CAC 

£ £ S % vS £ £ £ ^ U. Thr S.r jg IX. cxn 

211$ 2720 

rTr rlT Q rr CCT CAC CAA AAA CCA ACT CAC ATA AAA CCA CCA CAA AAT 6262 
CTC CAT CCC CCT SAV, w c . 
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2760 

% — — eee »cx CTO ACT CCT TTT AAT TAC AAC CCA ACC CCT ACC 

£ !S £ SS £ £ £ Sro fh. Mn Tyr A.n Pre S.r Pro Arg 

2780 27 *» 

aa acc ACC CCA CAT ACC ACT TCA CCT CCC CCA TCT CAG ATC CCA ACT 84 54 

£ £ £r S 2J K Thr S.r Al^Aro Pro Ser Cln mPro Thr 

flT ~ XXT AAC AAC ACA AAC AAS CCA CAT TCC AAA ACT 0AC ACC ACA 

S £ £ £ £ Thr ly. ty HP S«r ly. MrA.p s.r Thr 
28X0 281S 2820 

»v eei lee CAA ACT CCT AAC CCC CAT TCT CCC TCT TAC CTT 
£ £ £ £ £ ^ £ ?ro Ly. Arg Hi. f .r - Oly s.r Tyr U. 
CTC ACA TCT CTT TAAAAGACAC CAACAATCAA ACTAAGAAAA TTCTATCTTA 

Val Thr Ser Val 
2840 

ATTACAACTG CTATATAGAC ATTTTCTTIC AAATGAAACT T7AAAAGACT CAAAAATTTT 8662 

CTAAATAGGT TTCATTCTTC TTACACGCTT TTTGTTCTGG AAGCCATAT7 TGATAGTATA 8722 

CTTTCTCTTC ACTCCTCTTA TTTTCGGAGC CACTCTTCAT GG7TACCAAA AAA7AGAAAG 8782 

CCAACTATCT TTCTACACTA TCTTTTACAT CTATTTAAAG TACCATCCCA TCCCAACTTC 8842 

CTTAATTATT CCTTCTCTAA AAIAATCAAC ACTACACATA GCAAATATCA TATATTCCTC 8902 

TTATCAATCA TTTCTAOATT ATAAACTCAC TAAACTTACA TCAGGCCAAA ATTCCTATTT 8962 

ATCCAAAAAA AAAATCTTTT TCTCCTTCTC ACTCCATCTA ACATCATAAT TAATCATCTC 9022 

CCTCTGAAAT TCACACTAAT ATCCTTCCCC ATCAACAACT TTACCCAGCC TCCTTTGCTT 9082 

ACTCCATCAA TCAAACTOAT 0C77CAATT? CACAACTAAt CATTAACACT TATGTGGTCA 9142 

CATCATCTCC XT AG AC AT AC CT AC AC TCT A ATAATT7ACA CTATTTTGTG CTCCAAACAA 9202 

AACAAAAATC TCTCTAACTC IAAAACATTC AA7GAAACTA TTTTACCTGA ACTACATTTT 9262 

ATCTCAAACT AGGTAGAATT ITTCCTATCC TCTAATTTCT TCTATATTCT GGTATTTGAG 9322 

CTCACATCCC TGCTCTTTA7 TAA7CACACA TGAATTGTGT CTCAACACAA ACTAAATGAA 9382 

CATTTCACAA TAAATTATTC CTCTATOTAA ACTCTTACTC AAATTCCTAT TTCTTTCAAC 9442 

GGTTTGTTTC ACATTTCTA7 TAATTAATTC TTTAAAATGC CTCTTTTAAA ACCTTATATA 9502 

AATTTTTTCT TCAOCTTCTA 7CCATTAACA CTAAAATTCC TCTTACTGTA ATAAAAACAT 9562 

TCAACAACAC TCTTCCCACT TAACCATTCC ATCCCTTCCC ACTT 9606 

(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2843 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Met Ala Ala Ala Ser Tyr Asp Gln Leu Leu Lys Gln Val Glu Ala Leu 
1 5 10 15 

Lys Met Glu Asn Ser Asn Leu Arg Gln Glu Leu Glu Asp Asn Ser Asn 
20 25 30 

His Leu Thr Lys Leu Glu Thr Glu Ala Ser Asn Met Lys Glu Val Leu 
35 40 45 

Lys Gln Leu Gln Gly Ser Ile Glu Asp Glu Ala Met Ala Ser Ser Gly 
50 55 60 
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OX. Xl« A-P «■" ^ a \ v 0 Ar9 UU LY ' " U *' P 

Atn Ph. Pro Oly v.l Ly. Uu Arg S.r Ly. Met S.r Uu Arg S.r lyr 

cir s« Arg 01- Olr «« "J S " Cly " U gj § " 

v.l Pro M.t Oly s.r Ph. Pro Arg Arg Cly Ph. V.I A.« cly s.r Xr, 



11s 



Glu S.r Thr Cly Tyr uu A. Clu U» Clu ly. Clu Arg $.r Leu Uu 

130 135 
i« Al. A.p f« A.p Ly. cl« Clu Ly. Clu Ly. A.p Trp Tyr Tyr AL 
145 150 

dn Uu Cln A.n Uu Thr Ly. Ar, XX. MP «•* L.u Pre Uu Thr Clu 
A.n Ph. S.r Uu «n Thr A.p Uu ttr Arg Arc Cln Uu Clu Tyr Clu 
XI. Arg Cln XI. Arg v.l Al. Met Clu Clu Cln L.u Cly Thr Cy. Cln 



195 



275 



A.p M.t Clu Ly. Arc. AL Cln Arg Arg II. AL Arg 21. Cln Cln XI. 

210 2i5 

Clu Ly. A.p II. L.u Arg II. Arg Cln Uu Uu cir. S.r Cln Al. Thr 

225 230 2J * 

Clu Al. Clu Arg S.r S.r Cln A.n Ly. Hi. Clu Thr Cly S«r Hi. A.p 

24$ *" 

Al* Clu Arg Cln A.n Clu Cly Cln Cly V.l Cly Clu 11. A.n M.t AL 

tnr s« Cly A.n Cly Cln Cly s.r Thr Thr Arg M«t A.p Hi. Clu Thr 

Al. s.r v.l Uu scr S«r s.r ser Thr Hi. S.r AL Pro Arg Arg Uu 

290 295 

Thr s.r Hi. U« Cly Thr Ly. v.l clu M.t v.l Tyr S.r Uu Uu s« 

30S 310 

Met L.u Cly Thr Hi. A.p Ly. A.p A.p M.t S.r Arg Thr Uu Uu AL 

325 333 

Met S«r S«r S.r Cln A.p s.r Cy. IL S.r Met Arg Cln S.r Cly Cy. 
3 40 345 " u 

Uu Pro Leu Uu II. Cln Uu L.u Hi. Cly A.n A.p Ly. A.p s.r v.l 

355 360 J" 

Uu Uu Oly A.n S.r Arg Cly S«r Ly. Clu AL Arg AL Arg AL S.r 

370 375 3B 

Al. AL Uu Hi. A.n II. XI. Hi. S« Cln Pro A.p A.p Ly. Arg Cly 

365 390 
Arg Arg Glu Ile Arg Val Leu His Leu Leu Glu Gln Ile Arg Ala Tyr 

Cys Glu Thr Cys Trp Glu Trp Gln Glu Ala His Glu Pro Gly Met Asp 
420 425 430 

Gln Asp Lys Asn Pro Met Pro Ala Pro Val Glu His Gln Ile Cys Pro 
435 440 445 

Ala Val Cys Val Leu Met Lys Leu Ser Phe Asp Glu Glu His Arg His 
450 455 460 

Ala Met Asn Glu Leu Gly Gly Leu Gln Ala Ile Ala Glu Leu Leu Gln 
465 470 480 

Val Asp Cys Glu Met Tyr Gly Leu Thr Asn Asp His Tyr Ser Ile Thr 
485 490 495 

Leu Arg Arg Tyr Ala Gly Met Ala Leu Thr Asn Leu Thr Phe Gly Asp 
500 505 510 

Val Ala Asn Lys Ala Thr Leu Cys Ser Met Lys Gly Cys Met Arg Ala 
515 520 525 

Leu Val Ala Gln Leu Lys Ser Glu Ser Glu Asp Leu Gln Gln Val Ile 
530 535 540 

Ala Ser Val Leu Arg Asn Leu Ser Trp Arg Ala Asp Val Asn Ser Lys 
545 550 555 560 

Lys Thr Leu Arg Glu Val Gly Ser Val Lys Ala Leu Met Glu Cys Ala 
565 570 575 

Leu Glu Val Lys Lys Glu Ser Thr Leu Lys Ser Val Leu Ser Ala Leu 
580 585 590 

Trp Asn Leu Ser Ala His Cys Thr Glu Asn Lys Ala Asp Ile Cys Ala 
595 600 605 

Val Asp Gly Ala Leu Ala Phe Leu Val Gly Thr Leu Thr Tyr Arg Ser 
610 615 620 

Gln Thr Asn Thr Leu Ala Ile Ile Glu Ser Gly Gly Gly Ile Leu Arg 
625 630 635 640 

Asn Val Ser Ser Leu Ile Ala Thr Asn Glu Asp His Arg Gln Ile Leu 
645 650 655 

Arg Glu Asn Asn Cys Leu Gln Thr Leu Leu Gln His Leu Lys Ser His 
660 665 670 

Ser Leu Thr Ile Val Ser Asn Ala Cys Gly Thr Leu Trp Asn Leu Ser 
675 680 685 

Ala Arg Asn Pro Lys Asp Gln Glu Ala Leu Trp Asp Met Gly Ala Val 
690 695 700 

Ser Met Leu Lys Asn Leu Ile His Ser Lys His Lys Met Ile Ala Met 
705 710 715 720 

Gly Ser Ala Ala Ala Leu Arg Asn Leu Met Ala Asn Arg Pro Ala Lys 
725 730 735 
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tyr Ly. MP AU A.n II. M.t Ser Pro Cly S.r s.r Leu Pro »er Leu 
Hi. v.l xr, ty. el. Ly. Ala U- Clu Al. ei. f« A.p au cm Hi. 

7S5 760 
^ s.r Clu Thr Ph. A.p A.n II. A. ? A.n Leu Ser Pre ty. AU S.r 

770 775 
«U Arg S.r ty. Gin Arg Hi. ty. dn Ser Leu Tyr cly A.p Tyr vjl 

Ph. A.p Thr A.« Arg Hi. A.p A.p A.n Arg s.r A.p A.n Ph. A.n Thr 

•OS OAW 

Cly A.n M.t Thr V.l L.u S.r Pro Tyr U. A.n Thr Thr V.l L.u Pro 

•* g20 825 

Ser Ser S«r S.r S.r Arg Gly S.r L.u A.p S.r S.r Arg S.r Clu Ly. 

gjj 840 

A.p Arg S.r Uu Clu Arg Clu Arg Cly XI. «ly t.u Cly A.n Tyr Hi. 
SSO * ss 

Pro AU Thr Clu A.n Pro Cly Thr Ser S.r Ly. Arg Cly t«u Cln II. 

£65 $70 

S«r Thr Thr AU Al. Cln lie AU ty. V.l Met Clu Clu val Ser Ala 

685 .90 

U. Hif Thr S«r Cln Clu A.p Ar9 S.r s.r Cly S.r Thr Thr Clu L.u 
900 '°* ylv 

Hi. Cy. Val Thr A.p Clu Arg A.r. Ala L.u Arg Arg S« Ser AU Ala 

9^5 9J0 "25 

Hit Thr Ki. ser A.n Thr Tyr A.n Ph. Thr Ly. Smr Clu A.n Ser Asa 
930 935 940 

Arc Thr Cy. S€r Met Pro Tyr Alt Ly. Uu' Clu Tyr Ly. Ar 5 Ser Ser 
94 | 9S0 9SS **0 

A.n A.p Ser Uu A.n Str Vil S.r s.r A.n A.p Cly Tyr Cly ty. Arg 
9(5 970 

Cly Cln Mtt Lyi Pro S«r II. Clu Sir Tyr S.r Clu A.p A.p Clu S.r 
90C 985 "0 

LY9 Phe Cy. ser Tyr Cly Cln Tyr Pro AU A.p Uu Alt Hi. ty. Ut 

995 1000 lows 

HI. Ser Ala A.n Hi. Het A.p A.p A.n A.p Cly Clu Leu A.p Thr Pre 

1010 * 01S 102w 

Il« A.n Tyr S.r t.u ty. Tyr S.r A.? Clu Cln Leu A.n Ser Cly Arg 
J02S 1^30 I"** - u " u 

cln Ser Pro S«r Cln A.n Clu Arg Trp AU Arg Pro ty« Hi. II. II. 

1045 1°S° 1058 

Clu A.p Clu II. Ly. Cln S«r Clu Cln Arg Cln Ser Arg A.n Cln S.r 
10SC * 0S5 10 
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Thr Thr Tyr Pre V.l Tyr Tnr Clj. s.r Thr A.? Mp Ly. Hi. L.u Ly. 

1075 JOoO Awoa 

Phe Gln Pro His Phe Gly Gln Gln Glu Cys Val Ser Pro Tyr Arg Ser
1090 1095 1100

Arg Gly Ala Asn Gly Ser Glu Thr Asn Arg Val Gly Ser Asn His Gly
1105 1110 1115 1120

Ile Asn Gln Asn Val Ser Gln Ser Leu Cys Gln Glu Asp Asp Tyr Glu
1125 1130 1135

Asp Asp Lys Pro Thr Asn Tyr Ser Glu Arg Tyr Ser Glu Glu Glu Gln
1140 1145 1150

His Glu Glu Glu Glu Arg Pro Thr Asn Tyr Ser Ile Lys Tyr Asn Glu
1155 1160 1165

Glu Lys Arg His Val Asp Gln Pro Ile Asp Tyr Ser Leu Lys Tyr Ala
1170 1175 1180

Thr Asp Ile Pro Ser Ser Gln Lys Gln Ser Phe Ser Phe Ser Lys Ser
1185 1190 1195 1200

Ser Ser Gly Gln Ser Ser Lys Thr Glu His Met Ser Ser Ser Ser Glu
1205 1210 1215

Asn Thr Ser Thr Pro Ser Ser Asn Ala Lys Arg Gln Asn Gln Leu His
1220 1225 1230

Pro Ser Ser Ala Gln Ser Arg Ser Gly Gln Pro Gln Lys Ala Ala Thr
1235 1240 1245

Cys Lys Val Ser Ser Ile Asn Gln Glu Thr Ile Gln Thr Tyr Cys Val
1250 1255 1260

Glu Asp Thr Pro Ile Cys Phe Ser Arg Cys Ser Ser Leu Ser Ser Leu
1265 1270 1275 1280

Ser Ser Ala Glu Asp Glu Ile Gly Cys Asn Gln Thr Thr Gln Glu Ala
1285 1290 1295

Aep Ser Ala Aen Thr Leu Cln He Ala Clu He Lye Cly Lye lie Cly 
1300 130S 1310 

Thr Arg Ser Ala Glu Asp Pro Val Ser Glu Val Pro Ala Val Ser Gln
1315 1320 1325

His Pro Arg Thr Lys Ser Ser Arg Leu Gln Gly Ser Ser Leu Ser Ser
1330 1335 1340

Glu Ser Ala Arg His Lys Ala Val Glu Phe Pro Ser Gly Ala Lys Ser
1345 1350 1355 1360

Pro Ser Lys Ser Gly Ala Gln Thr Pro Lys Ser Pro Pro Glu His Tyr
1365 1370 1375

Val Gln Glu Thr Pro Leu Met Phe Ser Arg Cys Thr Ser Val Ser Ser
1380 1385 1390

Leu Asp Ser Phe Glu Ser Arg Ser Ile Ala Ser Ser Val Gln Ser Glu
1395 1400 1405
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Pre Cy. S.r Cly Het V.l Ser Cly He U. «« Pres., A. P Leu Fro 

1410 

A.p Ser Pre Cly Cln Jg »* Pro Pro Ser jg^r Ly.-Thr Pro j„ 

^ %* -h- Lvt Aro Clu v«i Pro Lyi Ain Lyi 

Pro Pro Pro cm Ola .h. l*» £| o <"> i4S5 



1445 

Al4 Pro thr Al. Clu Ly. Arc Clu S.r Oly. Pro Ly. Cln Al.M. V.l 

1460 

A.n Ale f uv«l «• ^ ™ JS™ ^ '° ** P *" 

Utt Hi. Ph. Al. Thr Clu Ser Thr Pro A.p Cly JJjtar Cy. Ser S.r 
14S0 



S « L.u S.r AU Leu Ser leu A.p Clu Pro J. IU Cln Ly. A.p 

1S05 x » iU 

Clu L.u Arc Ue He^Pro Pro v.l Cln Clu^A.n A.p A.n Cly A^Ciu 

Thr Clu s.r Clu Cln Pro Ly. Clu s.r A.n Clu A.n Cln Clu Ly. Clu 

1M0 1545 

»U Clu ty. Thr II. A.p s.r Clu Ly. A.p Leu Leu A.p A.p S.r A.p 

1555 15o0 

A.p A. P A.p II. Clu II. Leu Clu clu Cy. tie lie Ser Ai. K.t Pro 

1S70 1575 
Thr Ly. S.r S.r Arc Ly. cly Ly. Ly. Pro Al. Cln Thr Al. S.r Ly^ 

1S8S " 90 

Leu Pro Pro Pro v.l Al. Arc Ly. Pro Serbia Leu Pro V.l VfvLy. 

loU9 

Leu L.« Pro S.r Cln A.n Arg L.u C In Pro cln Ly. Hi. v.l s.r Phe 

1620 1615 *'* 

Thr Pro Cly A.p A.p «et Pro Arg v.l Tyr Cy, v.l Clu Cly Thr Pro 

IMS 16< - A 

II. A.n Ph. S.r Thr Al. Thr Ser L.u S.r A.? Leu Thr lie Clu Ser 

isso I' 5 * * 

ProPro A.n Clu Leu a.^L Cly Clu Cly V.^Arg Cly Cly Al. Cln^ 

ser Cly Clu Ph. Clu^Ly. Arg A.p Thr Il^Pro Thr Clu cly ArgS.r 

Thr A.p Clu Al^Cln Cly Cly Ly. T^S.r Ser V.l Thr Xl^Pro Clu 

Leu A.p Asp A.n Ly. Ala Civ Clu Cly A.p II. L.u Al. Clu Cy. II. 

1115 - 720 

A.„ ser Al. Met Pro Ly. Cly ly. s.r Mi. Ly. Pro Phe Ar, val Ly. 

1730 173J 1 
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Ly. lie Mt* A.p Cln Jal^Gln cln hU Uz J}j $ 5er Ur *** M * ^lo 

A.n Ly. A.n Cln Leu A.p Cly Ly. Lye Ly. Ly. ^ Thr $tr Pro Vtl 

17S5 1770 _ 1775 

tvs Pro He Pro Cln A.n Thr Olu Tyr Arg Thr Arg V 4 1 Arg Ly. A.n 
7 1780 i7fi * * 

Alt A.p Str Ly. A.n A.n Ltu A.n Ala Clu Arg Vtl Phi Scr A.p A.n 
1735 1800 1*0$ 

rvi xte Str Ly. Ly. Cln Atn Leu Lye Ain Afn Ser Ly. Asp Phe Am 

1810 m* x «0 

Aip Ly. Ltu Pro Atn A.n Olu Asp Arg Vtl Arg Cly Str Ph. Ala Pht 
1825 1630 IMS 1*40 

A.p Ser Pro Hi. Mi. Tyr Thr Pro lit Clu Cly Thr Pro Tyr Cy. Phe 
* 18 45 18S0 1855 

Ser Arg A.n A.p 8«r L«u S.r Ser Leu A.p Ph. A.p A.p A.p A.p Vtl 

* i860 1865 1870 

A.p Leu Ser Arg Clu Ly. Al. Clu Lou Arg Ly. Al. Ly. Clu A.n Ly. 
r 1875 1M0 1885 

Clu S.r Olu Al. Ly. Vtl Thr Scr Hi. Thr Clu L.u Thr Str Asn Cln 
1890 1895 1900 

Cln Ser Al. A.n Ly. Thr Cln Al* II. Alt Ly. Cln Pro II. A.n Arg 

1905 1910 1915 1920 

Civ Cln Pro Ly. Pro II. L.u cln Ly. Cln S.r Thr Phc Pro Cln Ser 
J 1925 1930 1935 

Ser Ly. A.p He Pro A.p Arg Cly Al. Al* Thr A.p Clu Ly. Leu Cln 
1940 1945 1950 

A.n Ph. Alt lie Clu A.n Thr Pro Vtl Cy. Ph. S.r Hi. Asn Ser Ser 
1955 1960 1965 

Leu Ser Ser Leu Ser A.p lie A.p Gin Clu A.n A.n A.n Ly. Clu A.n 

1970 1975 1980 

Clu Pro II. Ly. Clu Thr Olu Pro Pro A.p Ser Gin Gly Clu Pro Ser 
1985 1990 1995 . 2000 

Lv. Pro Cln Al. Ser Cly Tyr Alt Pro Ly. Ser Phe Hi. Vtl Clu A.p 

2005 2010 2015 

Thr Pro Vtl Cy. Ph. Ser Arg A.n Ser Ser L.« Str S.r Leu Ser lie 
2020 2025 2030 - 

Asp Ser Clu A.p Asp Leu Leu Gin Clu Cy. lie S.r Ser Al. Met Pro 
2035 2040 2045 

Ly. Ly. Ly. Ly. Pro Ser Arg Leu Ly. Cly Asp A.n Clu Ly. Hi. Ser 

7 2050 2055 2060 

Pro Arg Asn Met Cly Cly lie Leu Cly Clu A.p L.u Thr Leu A.p Leu 
2065 2070 2075 2080 
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elu M „ w a «, «. «. »• «»-,"» «* fK."' 

„ t ,« M r«. »» «• «■ »• »• «»• «- s,-« c " 

2115 2120 

,„ », s« "• "• »« •« - U ' " r *" 

2130 2135 
CXy S.r Fro Ph. Hi. I*. Thr Pro M» Cln JtaCta Ly. Pro IM I* 
214$ 2150 

- ♦ „« Pro Civ Clu ly. S«r Thr Ltu 
S.r Mn Ly. Ciy Pro Kcq Xi« L*u «*• rff 0 Ciy 7 2175 



2165 



Clu Thr 



ly , Ly. n. cxu s.r clu s.rLy. ciy ». Ly. jjj «r «*• 



2180 



ty . V.1 Tyr Ly. S.r L.u XI. thr Ciy Ly. v.l Arc S.rA.n S.r Clu 

2195 2200 
II. s.r Ciy Cln M.t ly. 01. fro Leu M. Al. A-n^t Pro S.r U. 

2210 22i5 
S.r Xr C Ciy Ar, Thr » XI. Hi. XI. ^o eg v.l Ar, A.n s.r 

2225 2230 . 

S.r ser Thr S.r fro v.l S.r Ly. Ly. aiyfro «*• g/" 



2245 



U S.r Ly. S.r >ro Ser Clu Ciy jg»r Al. Thr Thr grPro Arc 



22(0 



Ciy Al. g. s Pro 5« V.1 Ly. SerClu Leu Ser Pro V.^Al. Ar, Cln 

Tnr S.r Cir. He Ciy Ciy S« S.r Ly. Al. Pro S« Arc S.r Ciy S.r 

2250 2295 
Xr, MP S.r Thr Pro S.r Arc Pro XI. Cln Cl^Pro L.u S.r Ar, J* 

2305 2310 

,1. Cl« S.r Pro Cly $ Ar, A.» S.r II. S.r>ro Ciy Arc A.n 61^11. 

ser Pro Pro A.nLy. L.« S.r cln j» fr. Ar, Thr S.r S.rPro S.r 

Thr * u s« s Thr Ly. S.r S.r Closer Ciy Ly. M.t S.rTyr Thr S.r 

Pre aly Q Ar« Cln M.t s.r g» «■ A.n Leu Thr Ly.^Clr. Thr ciy L.u 

S.r s Ly. A.n Al. S.r Jjrtt. Pro Ar, S.r Clu^Scr Al. S.r Ly. Cl^ 

t . u A.n Cin H«t A.« A.« Ciy A.n Ciy Al. A.n Ly. Ly. v.l Clu^u 
2405 ** AW 
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Ser Arg ntt ser ser Thr Lyt Ser *^ $ cl y Ser clu * ,r £jj 0 * r9 4cr 

Glu Arg Pro Val Leu Val Arg Gln Ser Thr Phe Ile Lys Glu Ala Pro
2435 2440 2445

Ser Pro Thr Leu Arg Arg Lys Leu Glu Glu Ser Ala Ser Phe Glu Ser
2450 2455 2460

Leu Ser Pro Ser Ser Arg Pro Ala Ser Pro Thr Arg Ser Gln Ala Gln
2465 2470 2475 2480

Thr Pro Val Leu Ser Pro Ser Leu Pro Asp Met Ser Leu Ser Thr His
2485 2490 2495

Ser Ser Val Gln Ala Gly Gly Trp Arg Lys Leu Pro Pro Asn Leu Ser
2500 2505 2510

Pro Thr Ile Glu Tyr Asn Asp Gly Arg Pro Ala Lys Arg His Asp Ile
2515 2520 2525

Ala Arg Ser His Ser Glu Ser Pro Ser Arg Leu Pro Ile Asn Arg Ser
2530 2535 2540

Gly Thr Trp Lys Arg Glu His Ser Lys His Ser Ser Ser Leu Pro Arg
2545 2550 2555 2560

Val Ser Thr Trp Arg Arg Thr Gly Ser Ser Ser Ser Ile Leu Ser Ala
2565 2570 2575

Ser Ser Glu Ser Ser Glu Lys Ala Lys Ser Glu Asp Glu Lys His Val
2580 2585 2590

Asn Ser Ile Ser Gly Thr Lys Gln Ser Lys Glu Asn Gln Val Ser Ala
2595 2600 2605

Lys Gly Thr Trp Arg Lys Ile Lys Glu Asn Glu Phe Ser Pro Thr Asn
2610 2615 2620

Ser Thr Ser Gln Thr Val Ser Ser Gly Ala Thr Asn Gly Ala Glu Ser
2625 2630 2635 2640

Lys Thr Leu Ile Tyr Gln Met Ala Pro Ala Val Ser Lys Thr Glu Asp
2645 2650 2655

Val Trp Val Arg Ile Glu Asp Cys Pro Ile Asn Asn Pro Arg Ser Gly
2660 2665 2670

Arg Ser Pro Thr Gly Asn Thr Pro Pro Val Ile Asp Ser Val Ser Glu
2675 2680 2685

Lys Ala Asn Pro Asn Ile Lys Asp Ser Lys Asp Asn Gln Ala Lys Gln
2690 2695 2700

Asn Val Gly Asn Gly Ser Val Pro Met Arg Thr Val Gly Leu Glu Asn
2705 2710 2715 2720

Arg Leu Thr Ser Phe Ile Gln Val Asp Ala Pro Asp Gln Lys Gly Thr
2725 2730 2735

Glu Ile Lys Pro Gly Gln Asn Asn Pro Val Pro Val Ser Glu Thr Asn
2740 2745 2750
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27SS 27 * 

.« S.r Cly Thr Vtl XI* XI* v *l * h * » ro ' h ' 

2770 277S 

A . ntyr ».„ ,„ ... «*. *. »< - ft/" " ,Bt - SSo 

«c «. m«. ~ » ~ j- « - « 
».p «. jg - - - «• sis," 5 e " tM " a*" "* 

„, .» S; «r "< «* - 5J,« - 

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3172 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:

(B) CLONE: DP1(TB2)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..630

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

~ <~rr TXT CCC CCA CIA CCA ACA OCC CCC CCN CCC 
E JS S E £ E £ S S «y m XU Pro cly Cly 

a 5 

S S K S S £ 5! S S S S £2 S K! S K 

50 55 

-„ — «r m CTC TTC OCT TAT CCA CCC TCT CTC CTC TGC 

S S K 25 £ 55 £ c?y tyr CI, xl* S,r Uu u. Cy; 



48 



96 



144 



192 



240 



65 70 
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w cac TXe CCA CCC TAC ATC TCA ATT AAA OCT ATA 286 

SS S J2 8} Phi !g £ So K Tyr U. ... II. *J. U. 

... -i* CXT CA7 ACC CAC TGC CTC ACC "TAC TCC CTA 336 

Si! X E SS 51 S; 2; el, trp Uu Thr Tyr Trp v.l 



364 



432 



489 



H5 120 A ** 

TCA ICC Tie CCC TTC TXC TAe ATC CTC AM TCT CCC TTC CTC TTC TCC 
I" Trp Ph. Pro Ph. Tyr Tyr M.t Uu ty. Cy. cly Ph. U» L.u Trp 
1JO 13* 1 * B 

-,_ ... ccc CC6 WC CCT TCT AAT CCC CCT CAA CTC CTC TAC AAC CCC 
IS & 15 Pro S.r A.n Cly XI. Clu Uu Uu Tyr ly. teg 

ATC ATC CCT CCT TTC TTC CTC AAC CAC CA6 TCC CAC ATC CAC ACT CTC 528 
K lie At, Pro Ph. Ph. Uu Ly. Hi. Clu S.r Gin H.t A.p S.r V.l 

J65 1?0 l'S 

CTC AAC CAC CTT AAA CAC AAC TCC AAA CAC ACT CCA CAT CCC ATC ACT 576 
V.l ty. A.p Uu ly. A.p Ly. S.r Ly. Clu Thr Al* A.p Alt lie Thr 
J80 1,0 

AAA CAA CCC AAC AAA CCT ACC CTC AAT TTA CTC CCT CAA CAA AAC AAC 62 « 

Ly. Clu AL Ly. ty. Al. Thr V*l A.n Uu Uu Cly Clu Clu ly. Ly. 
19$ 200 205 



ACC ACC TAAACCAGAC TAAACCACAC TCCATCCAAA CTTCCTCCCC TCTCTGTACC 680
Ser Thr
210

TTCCTACTGG ACCTTCATCT TATATTAGCC ACTCTGGTAT AATTATTTTA ATAATCTTGC 740
CTTGGAAACA TTTTTCAGAT ATTAAAGATT GGAXTGTGTT CTAAGTTTCT TTGCTTACTT 800
TTACTGTCTA TATATATACC CACCACTTTA AACTTAATGC AGTGGGCAGT GTCCACGTTT 860
TTCCAAAATG TATTTTCCCT CTGGGTAGCA AAASATGTAT CTTCCTATCC TCCAGGAAAT 920
ATAAACTTAA AATAAAATTA TATACCCCAC AGGCTGTGTA CTTTACTGGC CTCTCCCTCC 980
AC5SATTTTC TCTGTACTTA CATTTAGCRT AATCTTTATG CTTCTACTTC CTKTAATGTA 1040
CAATTTTATA TAATTCWrtFA ATGTTTTTAA TCTATTTCTC CACATCTACA TATGGAAATG 1100
TTACTGTCTG ACTACAXCAT GCATCATGCT GATGGGCACG GACCAGCCCA ACCTTGTATC 1160
TCTCATTTAT AACTTCTCTA CAGTAACACC ACCTCCCAAA ACCTGGAGGA ACCATTCTGC 1220
TCCTCTCCTC TACTAAATAA TACTTTACCA AATACGTGAT TAATATGCAA GTGAACAAAC 1280
TCAGAAATCA AATCGAATGG AGATTGGCCT CCTTCTTTCC STACTATATC CCATATCAAT 1340
ACCAGGATAC CTTTATAAAC CAGTTAGTTA CTTAGTTACT CACTCTAGTG ATAAATCGGG 1400
AAATTTACAC ACACACACAC ACACACACAC ACACACACAC ACACACACAC ACACACACAC 1460
AGTACCCTGT XACTCTCXXT TCCCTCAAAA ACTAGTAATA CTGTCTTATC TGCTATAAAC 1520
TTTACATATT TGTCTATTCT CAAGATCCTA CAKTCCAH.C CATTTCTGGT TTTATCTTCA 1580
^SGGAGA* ACATCTTCAT TTAGTCTTCT TICCCAATCT ICTTTTTTAA KCCACTTTNA 1640
OCKKCTTCTC KAGAITTOJfC CACCTCTGAT TACATCTATC ITCTTGTTTG TATCATKAGC 1700
AACAACATGC TAATGKCGAC ACCTAGCTCT *RAGMGCAATT CTGGGACA*T CAWCCKWCT 1760
ATARAGTHNC CCATAATCTG CTTGCCAATA CITAACTCAA TCTATCITCA CTTTTTCTCT 1820
CCCCTTTAAG ©TCAAACACA ACACCCTTCC CTACTTTACA AGTCAGAGTC ACTTGTACTC 1880
CATTTAAAXO CCCTCATCCG TATTCTTXGT GTTGATAAGC TGCACAXCAC TACATACTAA 1940
GTACAGANCA CTAAACTTAA HNCGGATCTC ICCAT«ATC TGCCAANTCC KTATACACAC 2000
CAATTWTCT OGACTACAAA ATCTGAGTTT TACACCATAC TCTTAAGACT CCTTTTCAAT 2060
TAAACTAGAC TAAAACAACT GTATAACTAA ACTAACAACA TTAAATATCC AGCCA0TACA 2120
CTATTTTXTA AGGCAAATAA ACATCAITAG CTCACCTTGA CWAACAAIC ACCTAAGATC 2180
ATKACAATGT CTCATCATCT NAAMAATATT AAACATATCA AIACTAAGTG ACAGTATCAC 2240
MMCTAATATA ATATGCATCA GAGCATTTAT TTTCGCCAGC AAAACAGTGG TOATTXCCCC 2300
CATTTTATTA AACTlAAAAC TTTGTAGAAA GCAAACAAAA TTGTTCTTGG CASAAAATCA 2360
ACTTTTAGAT TAAAAAAATT TTAAGTAWCT AGGACTATTT AAATCCTTIT CCCATAAATA 2420
AAAGTACAOT TlTCTTCCTG GCAGAATGAA AATCAGCAAC KTCTAGCATA TAGACTATAT 2480
AATCAGATTG ACAGCAXAIA GAATATATTA TCAGACAAGA TGAGCAGGTA CAAAACTTAC 2540
TATTGCXCAT AATCACTTAC AGCCXAAAAH TAGMTKTAAA ATACIATAII AAATTCTGAA 2600
TGCAATTTTT TTTTCTTCCC TTGAGACCAA AAXTTAAGTT AACTCTTGCT GCCAGTCTAA 2660
GTGTAAATGT TAACACCAGG AGAAGTTAAG AAITGAGCAG TTCTGTTGCA TGATTTCCCA 2720
AATCAAAXAC TGCCTWGCI ACACTTTGAA AAACTAATTG AGCCTCTGCC TGGCTAGAAA 2780
KCAAG0CTTT ATTTOAATGT GAATAOTGTT TCAAAGGTAT GIAGITACAC AATTCCTACC 2840
AAACAGCTTA AATICTTCAA GAAAGAAKC CICCAGCAGT TATTCCCTTA CCTGAAGGCT 2900
TCAATCATTT GGATCAACAA C1CCTACTCT CGGGAAGAC" CCTCTACTCA CAGCTGAAGA 2960
AAATGAGCAC ACCCTTCACA CTGTTATCAC CTATCCIGAA CATCTGAIAC ACTGAATGGA 3020
AATAAATAGA TGTAAAIAAA ATTGAGWTCT CATTTAAAAA AAACCATGTG CCCAATCCCA 3080
AAATGACCTC ATGTTCTCCT TTAAACAGCA ACTGCACC=A CT*GC*CAG= CCATTCACCT 3140
AUCCTATATA TACATCICIG TCAGTCCCCC TC 3172

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 210 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Ala Val Ala Ala Pro Val Tyr Pro Ala Leu Gly Thr Ala Pro Gly Gly
1 5 10 15

Glu Thr Val Pro Ala Met Ser Ala Ala Met Arg Glu Arg Phe Asp Arg
20 25 30

Phe Leu His Glu Lys Asn Cys Met Thr Asp Leu Leu Ala Lys Leu Glu
35 40 45

Ala Lys Thr Gly Val Asn Arg Ser Phe Ile Ala Leu Gly Val Ile Gly
50 55 60

Leu Val Ala Leu Tyr Leu Val Phe Gly Tyr Gly Ala Ser Leu Leu Cys
65 70 75 80

Asn Leu Ile Gly Phe Gly Tyr Pro Ala Tyr Ile Ser Ile Lys Ala Ile
85 90 95

Glu Ser Pro Asn Lys Glu Asp Asp Thr Gln Trp Leu Thr Tyr Trp Val
100 105 110

Val Tyr Gly Val Phe Ser Ile Ala Glu Phe Phe Ser Asp Ile Phe Leu
115 120 125

Ser Trp Phe Pro Phe Tyr Tyr Met Leu Lys Cys Gly Phe Leu Leu Trp
130 135 140

Cys Met Ala Pro Ser Pro Ser Asn Gly Ala Glu Leu Leu Tyr Lys Arg
145 150 155 160

Ile Ile Arg Pro Phe Phe Leu Lys His Glu Ser Gln Met Asp Ser Val
165 170 175

Val Lys Asp Leu Lys Asp Lys Ser Lys Glu Thr Ala Asp Ala Ile Thr
180 185 190

Lys Glu Ala Lys Lys Ala Thr Val Asn Leu Leu Gly Glu Glu Lys Lys
195 200 205

Ser Thr
210

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 434 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:
(B) CLONE: TB1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

V.l AU Pro v.l v.l v.l Oly Ser Oly Arc Ai. Pro Ar 9 Hi. Pro Ala 

Pro Ai* AU Net Hi. Pro Are Arg Pro A.p oly Ph. A.p Oly Lou Cly 

20 28 
Tyr Axfl Cly Gly Al. Arc A. P Clu oln Oly Ph. Oly cly AU Ph. Pro 



35 



AU Arg S.r Ph. ser Thr Cly S.r A.p Leu cly Hi. Trp V.l Thr thr 

50 55 
Pro Pro A.p 11. Pro Oly s.r Aro A.n Leu Hi. Trp Oly Clu Ly. Ser 

65 70 

pro Pro Tyr Oly v.l Pro Thr Thr s.r Thr Pro Tyr Clu Cly Pro Thr 



85 



Ola Olu Pro Ph. S.r s.r Cly Oly Cly Cly Ser V.: Cln Cly Cln Ser 

s«r Clu Oln f« A.n Are Ph- Al. Cly Ph. Cly 11. Cly t.u Al. S.r 

H$ 120 

L«u Phe Thr Ciu A.n v.l Uu AU Hi. Pro Cy. IU v.l Leu Ar 9 Arg 

Cln Cy. Cln V.l A.n Tyr Hi. AU CU Hi. Tyr Hi. Leu Thr Pro Ph. 

ISO 

Thr V.l II. A.« IU H«t Tyr S.r Phe A.n Ly. Thr Cln Cly Pro Ar 9 

165 no i,a 

AU L.u Trp Ly. Cly Met Cly S.r Thr Phe II. V.l Cln Oly V.l thr 

180 lyo 

L.u Oly AU Clu Oly II. 2U Ser Clu Ph. Thr Pro Leu Pro Arc Olu 

195 200 

v.l Leu Hi- ty Trp S.r Pro Ly. Oln II. Oly Olu Hi. Leu Leu Leu 

210 21S " 

Lye ««r L.« Thr Tyr V.l V.l AU M.t Pro Ph. Tyr S.r AU S.r leu 

225 230 *** 

XI. Clu Thr V.l Oln Ser Clu lie He Arg A.p A.n Thr Cly IU Leu 

245 2 SO «« 

Clu Cy. V*l Ly. Olu Cly 11. Cly Arc v.l IU Gly H.t Cly v.l pro 

260 265 
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Hle Ser Lyt Arg Leu Leu Fro Leu Uu Ser Leu lie Phe Pro Thr V4l 

J75 280 285 

Leu Hit Cly Val Leu Hie Tyr He lie Ser Ser Vtl lie Gin Lyt Phe 

290 295 300- 

vel Leu Leu lie Leu Lye Arg Lyt Thr Tyr Aan Ser Hit Leu Ale Clu 

30S 310 315 320 

Ser Thr Ser Pro V4l Gin Ser Met Leu Aip Ale Tyr Phe Pro Clu Leu 
325 330 335 

He Ala Aen Phe Ale Ala Ser Leu Cyt Ser Asp Vel He Leu Tyr Pro 

340 345 350 

Leu Clu Thr Vel Leu Hie Arc Leu His He Gin Cly Thr Arg Thr He 
355 360 365 

He Aap Aen Thr Aep Leu Cly Tyr Clu Vel Leu Pro He Aen Thr Gin 
370 375 3S0 

Tvr Glu Cly Met Arg Aep Cye He Aen Thr He Arg Gin Glu Clu Cly 
38$ 390 395 400 

vel Phe Cly Phe Tyr Lye Cly Phe Ciy Ala Val He lie Gin Tyr Thr 
405 610 415 

Leu Hie Ala Ala Vel Leu Cln He Thr Lye He He Tyr ser Thr Leu 

420 425 430 

Leu Gin 

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 185 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:

(B) CLONE: YS-39(TB2)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Glu Leu Arg Arg Phe Asp Arg Phe Leu His Glu Lys Asn Cys Met Thr
1 5 10 15

Asp Leu Leu Ala Lys Leu Glu Ala Lys Thr Gly Val Asn Arg Ser Phe
20 25 30

Ile Ala Leu Gly Val Ile Gly Leu Val Ala Leu Tyr Leu Val Phe Gly
35 40 45
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ryr Cly M.* S.r L.» L.u Cy. A.n t« II. Cly Ph. «y Tyr Fro Al. 

SO 55 
Tyr II. ST II. ty. Hi. H. «« «.r Fro A.n ty. Clu A.p A.p Thr 



Cln Trp L.« Thr Tyr Trp V.l V.l Tyr Cly v.l ,h. S.r XI. U. Cl« 

8S * u 
Ph. Ph. S.r A.p Zl. Ph. tat S«r Trp Ph. Pro Ph. Tyr Tyr He Leu 

iy. cy. Cly Ph. t.u U« Trp Cy. M.t Al. Pro l« Pro Set A.n Cly 

120 

Al* Cl« f« fu Tyr ty. Are II. XI. Aro Pro Ph. Ph. In ty. Mi. 

130 I 35 



Clu S.r Cln Mt A.p S.r v*l v»l ty. A- P teu ty. A.p ty. Al. ty. 

^45 ISO 

Clu Thr Alt A.p Al. II. Thr ty. Clu kU ty. ty. AL Thr Val A.n 

X65 170 15 



Leu Leu Cly Clu Glu Lyt Lyt Str Thr 
180 I' 5 

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2843 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:

(B) CLONE: APC

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Ala Ala Ala Ser Tyr Asp Gln Leu Leu Lys Gln Val Glu Ala Leu
1 5 10 15

Lys Met Glu Asn Ser Asn Leu Arg Gln Glu Leu Glu Asp Asn Ser Asn
20 25 30

His Leu Thr Lys Leu Glu Thr Glu Ala Ser Asn Met Lys Glu Val Leu
35 40 45

Lys Gln Leu Gln Gly Ser Ile Glu Asp Glu Ala Met Ala Ser Ser Gly
50 55 60

Gln Ile Asp Leu Leu Glu Arg Leu Lys Glu Leu Asn Leu Asp Ser Ser
65 70 75 80
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*.n Ph. Pro cly v.l Ly. Uu Xr 9 *« ty. M.t s.r U« Ar 9 S.r Tyr 
85 *° " 

Cly s.r Ar* Clu Cly St v»1 S.r S.r Aro S.r Cly Ola Cy. S.r Pro 

v.l Pro M.t Cly s.r Ph. Pre Arg Ar 9 Cly Ph. w.l M» Cly S.r Arg 

jj5 120 i« 

Clu S.r Thr Cly Tyr Uu Clu Clu Uu Clu Ly$ Clu Ar 9 S.r L.u Uu 
130 »5 "0 

Uu AL A. P Uu x. P Ly. Clu Clu fcy. Clu ly; A.p Trp Tyr Tyr Al« 
145 ISO * ss 

Cln Wu Cln Am Uu Thr iy. Aro XI. kmp S.r Uu t.u Thr Clu A.n 

170 175 

Phe Ser Leu Gln Thr Asp Met Thr Arg Arg Gln Leu Glu Tyr Glu Ala
180 185 190

Arg Gln Ile Arg Val Ala Met Glu Glu Gln Leu Gly Thr Cys Gln Asp
195 200 205

Met Glu Lys Arg Ala Gln Arg Arg Ile Ala Arg Ile Gln Gln Ile Glu
210 215 220

Lys Asp Ile Leu Arg Ile Arg Gln Leu Leu Gln Ser Gln Ala Thr Glu
225 230 235 240

Ala Glu Arg Ser Ser Gln Asn Lys His Glu Thr Gly Ser His Asp Ala
245 250 255

Glu Arg Gln Asn Glu Gly Gln Gly Val Gly Glu Ile Asn Met Ala Thr
260 265 270

Ser Gly Asn Gly Gln Gly Ser Thr Thr Arg Met Asp His Glu Thr Ala
275 280 285

Ser Val Leu Ser Ser Ser Ser Thr His Ser Ala Pro Arg Arg Leu Thr
290 295 300

Ser His Leu Gly Thr Lys Val Glu Met Val Tyr Ser Leu Leu Ser Met
305 310 315 320

Leu Gly Thr His Asp Lys Asp Asp Met Ser Arg Thr Leu Leu Ala Met
325 330 335

Ser Ser Ser Gln Asp Ser Cys Ile Ser Met Arg Gln Ser Gly Cys Leu
340 345 350

Pro Leu Leu Ile Gln Leu Leu His Gly Asn Asp Lys Asp Ser Val Leu
355 360 365

Leu Gly Asn Ser Arg Gly Ser Lys Glu Ala Arg Ala Arg Ala Ser Ala
370 375 380

Ala Leu His Asn Ile Ile His Ser Gln Pro Asp Asp Lys Arg Gly Arg
385 390 395 400

Arg Glu Ile Arg Val Leu His Leu Leu Glu Gln Ile Arg Ala Tyr Cys
405 410 415



WO 92/13103 PCT/US92/00376



«« Thr Cy. Trp Cl» Trp Gin 01. JU M. Clu Pro «, Met *-P Cln 
Mp Ly. A.n Pro H.t Pro Al. J" v.l Olu Hi. 01. , XI. Cy. Pro .1. 



435 

val cy. v.l L.u «.t if «« «- CiU 3S Hl ' Xr ' Mi ' Al * 

450 45& 

jg m. oi- x- «y ;iy t« «■ »■ $ clu uu L,u cln SJ 

cy. 01» M.t Tyr Cly U- Thr A.n A.p Ki. Tyr Ser 11. Thr Leu 
xr, Arg tyr Al. cly H.t Al. l« Thr A.n L.u Thr Ph. Cly A. P v.l 
Al. A.n Ly. Al. Thr Leu Cy. S.r H.t Ly. Cly Cy. Met Arc AL Leu 

v.i ai. cm l.u H» s " citt A,p ^ gS ata V4i U * M * 

v.1 Leu A*, A.n Leu S.r Trp Arg AU J. f v.l A.n S.r Ly. Ly. 

Thr Leu Ar, Clu V.1 Oly S.r v.l Ly. Jl. t.u M.t Olu Cy. Al. Leu 

565 

01- V.1 Ly. 1*. Gl« ™ c L * u K $ * r V41 l * U $<r JJS ^ TrP 

550 

A.n Leu S«r Al. Hi. Cy. Thr Clu *.n Ly. Al. A.p II. Cy. Al. v.l 

ggg 600 wwa 

A.p Cly Al. Leu XI. Ph. Leu v.l Oly Thr Leu Thr Tyr Arg S.r Oln 

610 615 

Thr A.n Thr L.u Al. II. XI. Clu Ser Cly Cly Cly II. Leu Arg A.n 

625 

v.1 Ser S.r L.u II. Al. Thr A.n Clu A.p Hi. Arg Cln lie U. Arg 

64 5 

Clu A.n A.n Cy. Leu Cln Thr Leu Leu Cln Ki. Leu Ly. S.r Hi. Ser 

Cy. Cly Thr L.u Trp A.n 
675 « 80 MS 



660 ««5 
uu Thr II. V*l S.r A.n Al. Cy. Cly Thr L.u Trp A.n Leu Ser Al. 



Arg A.n ttz Ly. A.p cln elu Al. Leu Trp A.p Hit Cly Al. Vii Ser 



690 



mt Leu Ly. A.n L.« n. Ki. S.r Ly. Hi. Ly. Mt He Al. Met Cly 

705 710 

s.r Al. Al. Al. Leu Arg A.n L.u Met Al. A.n Arg Pro Al. Ly. Tyr 

730 

ty Asp AU A.n II. H.t Ser Pro Cly s.r s.r Le« Pro ser Leu Hi. 
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V.l Arg Ly. «n Ly. Al. L.u Clu Al. Clu L« A.p Al. Cln Hi. L.u 

75S 7W 

S.r fllu Thr Fh. A.p A.n XI. A.p *« >» «*■ A1 * $ « " l> 

775 /Bg - 



770 



Ar, Ser Ly. Cln Axg Hi. Ly. 61. Ser L.u Tyr Cly A.p Tyr v*l Pge 
785 750 

A.p Thr A.n Ar, Hi. A.p A.p A.n Arg S.r A.p A.n Ph. A.n Thr Cly 

A.n M.t Thr v.i L.« S.r Pre Tyr L.u A.n Thr Thr V.l Uu Pre S.r 

$20 825 

S.r S.r S.r S.r Arg Cly S.r Leu A.p S.r S.r Arg Ser Clu Ly. A.p 
$35 840 •«> 

Arg S.r L.u Clu Arg Clu Xrg Cly II. Cly L.u Cly Ait Tyr Hi. Pro 

eso ess «° 

Alt Thr Clu A.n Pro Cly Thr S.r S.r Ly. Arg Cly L«u Cln lit S.r 
S65 870 

Thr Thr Alt Alt Cln II. Alt Ly. V.I M.t Clu Clu V.I S.r Alt XI. 

065 890 895 

Hit Thr s.r cln clu A.p Arg Ser S.r Cly S.r Thr Thr Clu L.u Ml. 

900 90S 910 

cy. v.l Thr A.p Olu Arg A.n Al. L«u Arg Arg S.r S.r Alt Alt Hi. 
915 920 925 

Thr Hi. s.r Am Thr Tyr A.n Ph. Thr Lyt S.r Clu A.n s.r Atn Arg 
930 935 940 

Thr Cy. S«r Met Pro Tyr Alt Ly. L.u Clu Tyr Ly. Arg S.r S.r A.n 
945 950 9SS 960 

a.p s.r L.u A.n S.r Vtl Str s.r S.r A»p Cly Tyr cly Ly. Arg cly 
9(5 970 975 

Cln Met Ly. Pro S.r II. Clu S.r Tyr Ser Clu A.p A.p Olu S.r Ly. 
930 985 990 

Ph. cy. S.r Tyr Cly cln Tyr Pre Al. A.p L»u AL Hi. Ly. II. Hi. 
* 995 1000 1005 

S.r Al. A.n Hi. M.t Atp Atp A.n A.p Cly Clu L.u Atp Thr Pro He 
1010 101S 1020 

Atn Tyr S.r L.u Ly. Tyr s.r A.p Clu Cln L.u A.n S.r Cly Arg Cln 

1025 1030 103S 1040 

S.r Pro Ser Cln A.n Clu Arg Trp Al* Arg Pro Ly. Hi. II. II. Clu 
104S 1050 10S5 

A.P Clu II. Ly. Cln Ser Clu Cln Arg Oln S.r Arg A.n Cln S.r Thr 
r io.O 1065 1070 

Thr Tyr Pro v.l Tyr Thr clu s«r Thr Atp Atp Ly. Hit L.u Lyt Phe 
1075 1080 10.5 
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cl . elu cy, v*l st Pro Tyr Xr* *«r xrg 
Gin Pre Hit Ph. Cly Gin Cln Clu cy« n00 

1090 

~* m** i.« Arc Vtl Cly Ser Xtn Hii Cly He 
Cly tit Mn Cly Ser Clu Thr A.n Ar* ^ n20 

1105 1110 

M «. ». « s «» - - «• ti;. c,u " p n ' vs." 



-s-- Sfir ciu Clu Clu Cln His 
Xip Ly. Pro Thr A.n Tyr S«r Clu Arg Tyr ser ^ 
1X40 

. lB .1. Clu ciu Ar, Pro Thr A.n Tyr S.r II. Ly. Tyr^.n Clu Clu 

1135 1160 
Ly . xr 9 ML. v.l A.p CX. Pro IX. A.p Tyr ..r gj*. Tyr AX. Thr 

1170 1175 
Ai p XX. Pro s.r ,.r Cln Ly. CXn s.r Ph. S.^Phe S.r Ly. S.r 

lies li9 ° 

s . r cl y cln s.r S.r*. Thr clu Hi. S« s.r S.r CUA.n 

Thr s.r Thr Pro S.r S.r A.n AU Ly. Arg Cln A.n CXn gMU. Pro 



1220 



s„ s.r AX* Cln S.r Ar, S.r Cly^Cln Pro Cln ly. AUAU Thr Cy. 



1235 



ly V.l S.r S.r Il« A.n Cln Clu Thr II. CXn grtjr* Cy. v.l CXu 

1250 1255 

*.p Thr Pro U« Cy. Ph. Ser Arg Cy. S.r j.r — s.r Ser L.« Ser^ 

126S 1270 

s.r AU Clu A.p CXuU. Cly Cy. A.n CUThr Thr Cln Clu AU A. P 

ser AU A.n Thr l.u Cln lie AU Clu II. ly «■ Ly. lively Thr 

1300 AJW * 
xr 9 S.r JU CI. A.p Pro VI Mrtt. *i Pro AU v.l^.r CXn Ru 

pro Aro Thr ly. S« S.r Arc L.« CXn CXy S.r S.r L.u Ser S.r Ciu 

1330 1335 
S« Al. AT- HI. Ly. AU Q V.l Clu Ph. S.r grCly AU ly. Ser 

ser Ly. S.r Cly AU Cln Thr Pro Ly. grPr. Pro Clu Hi. Ty^val 



Gin 



Clu Thr tw 1*. ««t Phe S.r g^e*. Thr S.r V.l grit* l.u 



13S0 



Asp S.r Phe.Clu S.r Ar, Ser Xl.*. S.r s.r V.X Cinder Clu Pro 



13»5 



Cy. Ser Cly M« v.l Ser Cly XX. XI. $ .r Pre S.r^A.p L.u Pro A.p 



1410 »«» 
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Ser Pre Ciy Cln Thr wt »r. Fro S.r *r, Jjr «.y. T*r Fro Fro jw 

1425 14J0 

Pro Pro Olft Thr Al. Cln Thr Lyi Xro clival Pro Ly. Aan Lya^Ala 



1445 



Pro Thr Ala Glu Lys Arg Glu Ser Gly Pro Lys Gln Ala Ala Val Asn
1460 1465 1470

Ala Ala Val Gln Arg Val Gln Val Leu Pro Asp Ala Asp Thr Leu Leu
1475 1480 1485

His Phe Ala Thr Glu Ser Thr Pro Asp Gly Phe Ser Cys Ser Ser Ser
1490 1495 1500

Leu Ser Ala Leu Ser Leu Asp Glu Pro Phe Ile Gln Lys Asp Val Glu
1505 1510 1515 1520

Leu Arg Ile Met Pro Pro Val Gln Glu Asn Asp Asn Gly Asn Glu Thr
1525 1530 1535

Glu Ser Glu Gln Pro Lys Glu Ser Asn Glu Asn Gln Glu Lys Glu Ala
1540 1545 1550

Glu Lys Thr Ile Asp Ser Glu Lys Asp Leu Leu Asp Asp Ser Asp Asp
1555 1560 1565

Asp Asp Ile Glu Ile Leu Glu Glu Cys Ile Ile Ser Ala Met Pro Thr
1570 1575 1580

Lys Ser Ser Arg Lys Ala Lys Lys Pro Ala Gln Thr Ala Ser Lys Leu
1585 1590 1595 1600

Pro Pro Pro Val Ala Arg Lys Pro Ser Gln Leu Pro Val Tyr Lys Leu
1605 1610 1615

Leu Pro Ser Gln Asn Arg Leu Gln Pro Gln Lys His Val Ser Phe Thr
1620 1625 1630

Pro Gly Asp Asp Met Pro Arg Val Tyr Cys Val Glu Gly Thr Pro Ile
1635 1640 1645

Asn Phe Ser Thr Ala Thr Ser Leu Ser Asp Leu Thr Ile Glu Ser Pro
1650 1655 1660

Pro Asn Glu Leu Ala Ala Gly Glu Gly Val Arg Gly Gly Ala Gln Ser
1665 1670 1675 1680

Gly Glu Phe Glu Lys Arg Asp Thr Ile Pro Thr Glu Gly Arg Ser Thr
1685 1690 1695

Asp Glu Ala Gln Gly Gly Lys Thr Ser Ser Val Thr Ile Pro Glu Leu
1700 1705 1710

Asp Asp Asn Lys Ala Glu Glu Gly Asp Ile Leu Ala Glu Cys Ile Asn
1715 1720 1725

Ser Ala Met Pro Lys Gly Lys Ser His Lys Pro Phe Arg Val Lys Lys
1730 1735 1740

Ile Met Asp Gln Val Gln Gln Ala Ser Ala Ser Ser Ser Ala Pro Asn
1745 1750 1755 1760
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, r . A.n cm t.u } .p s ciy nr. tor- g- 0 '» "° ^ 

. . c\ u rvr Arc Thr at? v«i Are Ly. A." AX. 

Pre U* Pro J 1 , n 0 A,n Thr C ^ 07«$ ~ X790 

».« »i« Clu Arg V.l Ph. S.r A.p A.n Ly. 
Asp S.r Ly. A.n A.n L.u A.n Alt CW ax 5 ^ 

179S iBU 

. i.n L«u Ly. A.n A.n s.r Ly. A.p Ph. A.n At? 

Aip s.r Ly. Ly. 61b A.n u« iyi iM0 

1810 1819 
Ly . lcu Pro A.n A.« .1. ft., Arc v.l at, ciy^.r Ph. AX. Ph. Jft 

182S 1830 

s .r Pre Hi. Hi. tyr Thr Pre IX. «Xu clyjhr Pre Tyr Cy. Ph. $ s.r 

1845 

Arc A.n A.p S.r o L.u S.r Ser L~ A.p $ Ph. A-P ft* A. P A.pVtl A.p 

,.u ser Are "! Ly. AX. clu g. ft* ^ L y . C^A.n Ly. Clu 

1875 1880 

S.r Clu AU Ly. V.1 Thr S.r Hi. Thr clu Thr*r A.n Cln CXn 

1890 1895 

Aim XU AU Lvf Cln Pro Il« Arg Clr 
Str Alt Atn ty Thr Cln Aim xi« ai» *-r 192C 

1905 1910 

OX. Pro Ly. Pre XX. U- J« * h < ' b « " B 



ty. A.p II. Pro A.p Are Cly ftU AX. Thr A.p CX« Ly. L-u^CXn ftn 
1940 



Ph. AX. IX. Clu ft.. Thr Pre V.l Cy. ft. S.r Hi. A.^S.r S.r L.u 

1955 1?6C 



5 .r Ser u. S.r A.p IX. M ? «. Clu A.n A.n ft.^Ly- Clu A.n Clu 



1970 



Pre II. Ly. Clu Thr ? Xu Pre Pre ft. P s.r CX^Oly CXu Pro S.r Ly^ 

1985 19?0 

Pre exn AX. S.r Cly^r Alt Pro Ly. ««. Hi. v.l CXu A^Thr 
Pro v.l Cy. Ph. S« *r 9 A.n S.r S.r :..u Ser S.r Urn S.^IX. A.p 



2020 



«.r CX« A.P.A.P Uu L.u CXn CXu*. IX. S.r S.r ftU H.t Pro Ly. 



2035 



Ly. Ly. Ly. Pre Ser Arg Leu Ly. Cly A.p A.n Ol^Ly. Hi. S.r Pre 

2050 20 - a 



Arc A.n M Cly Cly IX. L.u Cly CXu A.p U« thr Leu A.p L.u Ly^ 

206S 2070 

A.p IX. CXn Ar 9 Prc^p S.r cl« -L. OXy*- S.r Pre A.p gr «. 
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A.. Ph. MP trpty. AU II. 61. Clu «y AL A.n S.r ZUV.l S.r 



u <* cin iu Alt Alt AU Alt cyt t«u Str Arg 
S«r L«u HL^Clft Alt Ait £* 20 _ 2 12S 



2100 

AU Alt cyt t«u Str Arg Alt s«r 

2115 ' " 

S.r Afp S.r A.p *.r II. U; Mr L.« LP Str Cly II. S.r u« Gly 

2X30 2135 
st Pre Ph. Mi. Uu Thr Pre A.p «ln Clu Clu Ly. Pre Ph. Thr S.^ 

2145 2150 «*« 

A.n Ly. Cly Pre Argil. U« **■ P« «« «« J", 61 " 

Thr Ly. Ly. Il«,clu »«r Clu s.r ly-My »• ty- «y JJ^y 



2180 
J 

219S 



V»l Tyr Ly.s.r Uu U. Thr Clyly. v.l Are s.r A.^Sor Clu II. 



s.r Cly Gin Met Ly. Cln Pre fcw cm AH A.n Mt Pro S.r 11. s.r 

2210 2215 2220 

Arg Oly Arg Thr >ut mHli XI. Pro Cly V.^Arg A.n S.r S.r S.r^ 

S.r Thr S.r Pro V.l Str Ly. Ly. Cly Pro Pre L.u Ly. Thr Pro Al. 

2J45 22SC 22S> 

S.r Ly. S«r Pro s.r Clu Cly Cln Thr Thr ** r Jf^** 9 Ciy 

Al. ly« Pro Mr Vil Ly. Ser Clu L.u Ser Pro V.l AU Arg Cln Thr 
2375 2280 32SS 

S.r cln II. Cly Cly S.r gr^y* * u Fro *« r }fg 0 Ser Cly ** r *** 
A.p Ser Thr Pro s.r Arg^Pre Al. Cln Cln Pro^Leu Ser Arg Pro I1. q 
Cln S.r Pro Cly Arg^n S.r Il« S.r Pr^Cly Arg A.n Ciy 11.^..- 

Pro Pro A.n Ly. L«u s.r Cln L«u Pro Arg Thr Scr S.r Pro S.r Thr 
2340 234S 2350 

AH S«r Thr Ly. S.r S.r Cly S.r Cly Ly. Met s.r Tyr Thr S.r Pro 
2355 23C0 2365 

Cly Arg Cln *».t S.r Cln |^ s A,n Uu Thr Ly * f JS 0 Thr ° Xy UU *** 

Ly.A.n Al. S.r s.r Il.^ro Arg S.r Clu S.: XU S.r Ly. Cly L.U q 

A.n Cln M« A.n A.n Cly A.n Cly Al. A.n Ly. Ly. V.l Clu U» S«r 

2405 2410 2415 

Arg M.t S.r S.r^Thr Ly. Ser S.r Cly^.r Ck Ser A.? Arg^r Clu 
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^ Pro v.l u. v.l Arg Cln S.r Thr Ph. U. ty. g «. '» 

2435 2 * 4 ^ 

- , ir« jltc Lvi Leu Clu Clu Ser AU Scr f h« Clu Ser L*u 

Pro Thr L«u Arg Xrg x*y» r':, 2460 

2450 2455 

ft .r Pro Ser Ser Arg Pro Al. Ser Pro Thr ArgS.r Cln Ala Cln f* 
3465 2470 

Pre V.l Uu »« Pro S.r Leu Pro A.p -et Ser Leu Ser Thr KUSer 



248S »«»° 
s „ v.l Cln Alt Cly cly Trp Arg Jy. L.u Pre Pro A.n L-^Ser Pro 

2500 * 9V 
Thr II. CUTyr A» *.P Cly Arg^Pro AX. Ly. Arc HUA.p II. AU 

Arg s« Hi. Ser Clu s.r Pro ser At, l.« Pre IU A.n Arg Ser Cly 

2S30 253 » 
thr^rp IV Arg Clu Hi.^.r Ly. Hi. s.r Ser S.r u« Pre Ar, 

Ser Thr Trp Arg Arg^Thr cly Ser S.r Ser^.r II. Leu Ser AU Ser 

S.r Clu S.r s.r^Clu Ly. AU Ly. S.r«u A.p clu Ly. Hi.^.l A.n 

S.r lie Ser Cly Thr Ly. Cl» S.r Ly. Clu A.n Cln V.l S.r AU Ly. 

2 £ 2 OUU * www 

Cly Thr Trp Ar S Ly. XI. A.n Clu Phe S.r Pre Thr A.n ».r 

2610 2615 
Thr s.r Cln Thr v.l Ser S.r Cly AU Thr A.n cly AU Clu ser g. 

2625 2630 

,hr L.u II. Tyr Oln.K.t AU Pro AU V.lS.r Ly. thr Clu A.pv.l 



264S 



Trp v.l Arg II. Clu A.p cy. Pro XI. Am A.n Pro Arg Ser Cly Arg 

2560 2**9 *w#v 

Ser Pre Thr Cly A.n Thr Pre Pro V.l II. A.p S.r v.l S.r Clu Ly. 

AU AM^ro A.« XI. Ly. A.| s S.r Ly. A.p A.n CXn^U Ly. Cln A.n 

V.1 Cly A.n Cly Ser V.l Pro he: Arg Thr v.l Cly L.u Clu A.n Arg^ 
270S 2710 
xan Ser Ph. 

2725 

I 

2740 



L.u A.n S.r Phe lU^Cln V.l A.p AU ProA.p Cln Ly. Oly Thr $ Clu 

II. Ly. Pro Cly Cln A.n A.n Pre v.l Pro v.l 5.r clu Thr A.n Clu 

2740 2145 d 9 

S .r Ser He V.l Clu Arg Thr Pro Phe S.r s.r s.r Server Ser Ly. 



27SS 
Hl. S.r S.r Pro S.r Cly Thr V*l AL Mi Arc v»l Thr Pro Ph. A.n

2770 3715 

T yr A.n Pre S.r Pro Are ly. S.r S.r AU A.p S.r Thr S.r AL Ar, 
J76S 2'W 2" s 2805 

Pre s.r cln II. Pro Thr Pro V.X A.r. A.n A.n Thr Ly. Arc A.p 

2805 2810 ^oi» 

S.r iy. Thr A.p S.r Thr Clu S.r S.r Cly Thr Gin S.r Pro Ly. Arg 

2820 2S2S "JO 

His S.r Cly ser Tyr Uu VaI Thr Ser VaI 
28iS "40 

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vii) IMMEDIATE SOURCE:

(B) CLONE: ral2 (yeast)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Leu Thr Cly X1a Lye Cly Leu Cln Leu Xrg X1a Leu Xrg Xrg !!• xIa 
IS 10 15 

Xrg lie Clu Cln Cly Cly Tnr XIa I1a S«r Pro Thr S«r Pro Leu 
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:

(B) CLONE: m3 (mAChR)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

Leu Tyr Trp Xrg He Tyr Lye Clu Thr Clu Lys Xrg Thr Lye Clu Leu 
1 S 10 IS 
n„ xl. ely Thr CXu Mi eiu Thr elu
iu Cly L«u Gln Xl " * 25 
20 

(2) INFORMATION FOR SEQ ID NO:10:

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide

(vii) IMMEDIATE SOURCE:

(B) CLONE: MCC

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

. e *.„ u. au cl« CX« Arg S« Ar„ trp cu Ly. JJ« i- 

Leu Tyr Pro A»n x*u ai* ** ^ 15 

L. cly cltt 6lu * ,R clu S f ^ ^ XU M * 6 



(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
CTXTCAAGAC TGTCRCTIT AATTCTAGTT TATCCXTTTI 
(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
TTTACAATTT CATGTTAATA TATTOTGTTC T7TTTAACAG 
(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
GTACATTTTA AAAA6GTGTT TTAAAATAAT TTTTTAAGCT 
(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
AAGCAATTCT TGTATAAAAA CTTGTTTCTA TTTTATTTAG 
(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
CTAACTTTTC TTCATATAGT AAACATTGCC TTCTGTACTC 
(2) INFORMATION FOR SEQ ID NO:16:
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
HXNHNNNNNN HKNCTCCCTT TTTTTAAAAA AAAAAAATAC 

(2) INFORMATION FOR SEQ ID NO:17:

(i) SEQUENCE CHARACTERISTICS:

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
6TXACTAACT T«CA«T*CA ACTTATTTCA AACTTTAATA 
(2) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
ATACAAGATA TTCATACTTT TTTATTATTT CWGTTTTAO 
(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear






(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
GTAAGTTAC? TGTTTCTAAC TGATAAAACA 6TGAAGAGCT «° 
(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
RATAAAAACA TAACTAATTA 06TTT C TTCT TTTATTTTAC <° 
(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
CTTAGTAAAT TSCCTTTTTT CTTTCTGGCT ATAAAAATAG <° 
(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:

accatttttg catctactca tcttaactcc atcttaacag 

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:

etAAATAAAT TATTTTATCA TATTTTTTAA AATTATTTAA 

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 64 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
CATGATCTTA TCTCTATTTA CCTATACTCT AAATTATACC ATCTATAATG TCCTTAATTT 

TTAC 

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 52 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
GTAACAGAAC ATTACAAACC CTGCTCACTA ATGCCATCAC TACTTTCCTA AG 52 



(2) INFORMATION FOR SEQ ID NO:26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 46 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
CGATATTAAA CTCGTAATTT TCTTTCTAAX CTCATTTOGC CCACAC 46 
(2) INFORMATION FOR SEQ ID NO:27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:

GTATGTTCTC TATRGT6TAC ATCGTAGTCC ATGTTTCXAA ** 

(2) INFORMATION FOR SEQ ID NO:28:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 56 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
CATCATTCCT CTTCAAATAA CAAACCATTA TGGTTTATGT TCATTTTXTT TTTCAC S6 
(2) INFORMATION FOR SEQ ID NO:29:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 43 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
GTAAGACAAA AATC7TTTTT AATGACATAC ACAATTACTC CTC 
(2) INFORMATION FOR SEQ ID NO:30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
TTAGATGATT CTCTTTTTCC TCTTCCCCTT tttaaattag 
(2) INFORMATION FOR SEQ ID NO:31:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 44 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:

CTATCTTTTT ATAACATCTA TTTCTTAAGA TAGCTCAGCT ATGA 

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 54 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

CCTTOCCTtC AWTTGNCTT TTXAATCXT. CTCTATTCTC TATTTAATTT ACA<=
(2) INFORMATION FOR SEQ ID NO:33:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 65 base pairs

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:
GTACTATTTA 0AATT7CACC TGT77TTCTT TTTTCTCTTT TTCTTTSA6G CACCCtCTCA . 

CTCTC 

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 52 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
GCAACTACTA TGATTTTATG TATAAATTAA TCTAAAATTG ATTAATTTCC AG » 
(2) INFORMATION FOR SEQ ID NO:35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:
GTACCTTTGA AAACATTTAG TACTATAATfc TGAATTTCAT GT < 
(2) INFORMATION FOR SEQ ID NO:36:
(i) SEQUENCE CHARACTERISTICS:

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:

CCAACTCHAA TTACATOACC CATATTCAGA AACTTACTAC 

(2) INFORMATION FOR SEQ ID NO:37:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 54 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:
CTATATATAO ACTTTTATAT TACTTTTAAA GTACASAATT CATACTCTCA AAAA 
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:
ATTGTCACCT TAATTTTGTG ATCTCTTGAT TTTTATTTCA C « 

(2) INFORMATION FOR SEQ ID NO:39: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

TCCCCCCCTC CCGCTCTC 18 

(2) INFORMATION FOR SEQ ID NO:40:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

CCACCCCCCC CTCCCGTG 18 

(2) INFORMATION FOR SEQ ID NO:41: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41: 

GTGAACGCCT CTCATGCTGC 20 

(2) INFORMATION FOR SEQ ID NO:42:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:

ACGTOCGCGG AOCXATCCX 

(2) INFORMATION FOR SEQ ID NO:43:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:
ATCATATCTT ACCAAATCAT ATAC 
(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:
TTATTCCTAC TTCTTCTATA CAG 
(2) INFORMATION FOR SEQ ID NO:45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:
(2) INFORMATION FOR SEQ ID NO:46:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:

TGCCCCCATC TTCTTCCTCA 20

(2) INFORMATION FOR SEQ ID NO:47:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:

ACA7TACCCA CAAAGCTTGC AA 23 

(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

ATCAACCTCC AGTAAGAAGG TA 22 

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49:
TGCCCCTCC7 CCCTTCTTG 
(2) INFORMATION FOR SEQ ID NO:50:
(i) SEQUENCE CHARACTERISTICS:

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:50:

1 20 
CCCCCTTCCT TTCTCACCAC 

(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

21 

TTTTCTCCTG CCTCTTACTG C 

(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens



WO 92/13103 PCT/US92/00376 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

20 

ATGACACCCC CCATTCCCTC 

(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:
CCACTTAAA0 CACATATATT TACT 
(2) INFORMATION FOR SEQ ID NO:54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:
GTATCGAAAA TAGTGAAGAA CC 
(2) INFORMATION FOR SEQ ID NO:55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:
TTCTTAAGTC CT G TTIITCT TTTG 



24 
(2) INFORMATION FOR SEQ ID NO:56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:
TTTAGAACCT TTTTTGTCTT CT6 23 
(2) INFORMATION FOR SEQ ID NO:57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:
CTCAGATTAT ACACTAAGCC TAAC 24 
(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:
CATGTCTCTT ACAGTAGTAC Ck 22 
(2) INFORMATION FOR SEQ ID NO:59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:



ACCTCCAACC GTACCCAAGC 20 

(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:60:
TAAAAATGGA TAAACTACAA TTAAAAC 27 
(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

AAATACAGAA TCATGTCTTG AAGT 24 

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:
ACACCTAAAC ATGACAATTT CAC 
(2) INFORMATION FOR SEQ ID NO:63:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:63:
TAACTTACAT AGCAGTAATT TCCC 
(2) INFORMATION FOR SEQ ID NO:64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
ACAATAAACT GGACTACACA AGG 
(2) INFORMATION FOR SEQ ID NO:65:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:65:
ATACGTCATT CCTTCTTGCT GAT 
(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:66:
TGAATTTTAA TGGATTACCT AGGT 
(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:67:
CTTTTTTTGC TTTTACTCAT TAACG 
(2) INFORMATION FOR SEQ ID NO:68:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

TCTAATTCAT TTTATTCCTA ATAGCTC 

(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear






(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:
CCTACCCATA CTATGATTAT TTCT 
(2) INFORMATION FOR SEQ ID NO:70:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

24 

CTACCTATTT TTATACCCAC AAAC 

(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

AAGAAAGCCT ACACCATTTT TGC 

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:72:

23 

GATCATTCTT ACAACCATCT ICC 

(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:73:
ACCTATACTC TAAATTATAC CATC 24 
(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:
CTCATGGCAT TAGTCACCAC 20 
(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 
ACTCCTAATT TTCTTTCTAA ACTC 
(2) INFORMATION FOR SEQ ID NO: 76:



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:
TGAAGGACTC CCATTTCACG C 
(2) INFORMATION FOR SEQ ID NO:77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:77: 
TCATTCACTC ACACCCTCAT GAC 
(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:
GCTTTGAAAC ATGCACTACO AT 
(2) INFORMATION FOR SEQ ID NO:79:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:79:
AAACATCATT GCTCTTCAAA TXAC 
(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:
TAC CATC ATT TAAAAATCCA CCAG 
(2) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:81:

GATGATTGTC TTTTTCCTCT TCC 

(2) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:82:
CTGACCTATC TTAACAAATA CATC 
(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

TTTTAAATGA TCCTCTATTC TGTAT 

(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:84:
ACACACTCAC ACCCTGCCTC AAAC 
(2) INFORMATION FOR SEQ ID NO:85:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:
TTTCTATTCT TACTCCTASC ATT 




(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:86: 
ATACACAGGT AAGAAATTAG CA 
(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:87:
TAOATGACCC ATATTCTCTT TC 
(2) INFORMATION FOR SEQ ID NO:88:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

CAATTACCTC TTTT I GACAC TA 

(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 22 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:89:

22 

G7TACTGCAT ACACATTCTC AC 

(2) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 

OCTTTTTCTT TCCTAACATG AAG 23 

(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:
TCTCCCACAC GTAATACTCC C 21 
(2) INFORMATION FOR SEQ ID NO:92:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:92:
CCTAGAACTG AATCGCGTAC C 
(2) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:

CACCACAAAA TAATCCTOTC CC 

(2) INFORMATION FOR SEQ ID NO:94:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:
AT IT7 CT TAG TTTCATTCTT CCTC 
CLAIMS

1. A method of diagnosing or prognosing a neoplastic tissue 
of a human, comprising: 

detecting somatic alteration of wild-type APC gene cod-
ing sequences or their expression products in a tumor tissue isolated
from a human, said alteration indicating neoplasia of the tissue.

2. The method of claim 1 wherein the expression products
are mRNA molecules.

3. The method of claim 2 wherein the alteration of 
wild-type APC mRNA is detected by hybridization of mRNA from said 
tissue to an APC gene probe. 

4. The method of claim 1 wherein alteration of wild-type
APC gene coding sequences is detected by observing shifts in 
electrophoretic mobility of single-stranded DNA on non-denaturing 
polyacrylamide gels. 

5. The method of claim 1 wherein alteration of wild-type 
APC gene coding sequences is detected by hybridization of an APC 
gene coding sequence probe to genomic DNA isolated from said tissue. 

6. The method of claim 5 further comprising: 

subjecting genomic DNA isolated from a non-neoplastic 
tissue of the human to Southern hybridization with the APC gene cod- 
ing sequence probe; and 

comparing the hybridizations of the APC gene probe to 
said tumor and non-neoplastic tissues. 

7. The method of claim 5 wherein the APC gene probe 
detects a restriction fragment length polymorphism. 

8. The method of claim 1 wherein the alteration of 
wild-type APC gene coding sequences is detected by determining the 
sequence of all or part of an APC gene in said tissue using a polymerase 
chain reaction, deviations in the APC sequence determined from that 
of the sequence shown in Figure 7 (SEQ ID NO: 1) suggesting neoplasia. 

9. The method of claim 1 wherein the alteration of wild- 
type APC gene coding sequences is detected by identifying a mismatch 
between molecules (1) an APC gene or APC mRNA isolated from said 
tissue and (2) a nucleic acid probe complementary to the human wild- 
type APC gene coding sequence, when molecules (1) and (2) are hybrid- 
ized to each other to form a duplex. 

10. The method of claim 5 wherein the APC gene probe 
hybridizes to an exon selected from the group consisting of: (1) 
nucleotides 822 to 930; and (2) nucleotides 931 to 1309; (3) nucleotides 
1406 to 1545; and (4) nucleotides 1956 to 2256. 

11. The method of claim 1 wherein the alteration of wild- 
type APC gene coding sequences is detected by amplification of APC 
gene sequences in said tissue and hybridization of the amplified APC 
sequences to nucleic acid probes which comprise APC sequences. 

12. The method of claim 1 wherein the alteration of 
wild-type APC gene coding sequences is detected by molecular cloning 
of the APC genes in said tissue and sequencing all or part of the cloned 
APC gene. 

13. The method of claim 1 wherein the detection of alter- 
ation of wild-type APC gene coding sequences comprises screening for 
a deletion mutation. 

14. The method of claim 1 wherein the detection of alter- 
ation of wild-type APC gene coding sequences comprises screening for 
a point mutation. 

15. The method of claim 1 wherein the detection of alter- 
ation of wild-type APC gene coding sequences comprises screening for 
an insertion mutation. 

16. The method of claim 1 wherein the tumor tissue is a 
colorectal tissue. 

17. The method of claim 6 wherein the non-neoplastic tissue 
isolated from a human is from colonic mucosa. 

18. The method of claim 1 wherein the expression products 
are protein molecules. 

19. The method of claim 18 wherein the alteration of 
wild-type APC protein is detected by immunoblotting. 

20. The method of claim 18 wherein the alteration of 
wild-type APC protein is detected by immunocytochemistry. 
21. The method of claim 18 wherein the alteration of 
wild-type APC protein is detected by assaying for binding interactions 
between APC protein of said tumor tissue and a second cellular protein. 

22. The method of claim 21 wherein the second cellular pro- 
tein is selected from the group consisting of MCC protein, wild-type 
APC protein, and a G protein. 

23. The method of claim 18 wherein the alteration of 
wild-type APC protein is detected by assaying for phospholipid 
metabolites. 

24. A method of supplying wild-type APC gene function to a 
cell which has lost said function by virtue of a mutation in an APC 
gene, comprising: 

introducing a wild-type APC gene into a cell which has 
lost said gene function such that said wild-type APC gene is expressed 
in the cell. 

25. The method of claim 24 wherein the wild-type APC gene 
introduced recombines with the endogenous mutant APC gene present 
in the cell by a double recombination event to correct the APC gene 
mutation. 

26. A method of supplying wild-type APC gene function to a 
cell which has altered APC function by virtue of a mutation in an APC 
gene, comprising: 

introducing a portion of a wild-type APC gene into a cell 
which has lost said gene function such that said portion is expressed in 
the cell, said portion encoding a part of the APC protein which is 
required for non-neoplastic growth of said cell. 

27. A method of supplying wild-type APC gene function to a 
cell which has altered APC function by virtue of a mutation in an APC 
gene, comprising: 

applying human wild-type APC protein to a cell which has 
lost wild-type APC function. 

28. A method of supplying wild-type APC gene function to a 
cell which has altered APC gene function by virtue of a mutation in an 
APC gene, comprising: 
introducing into the cell a molecule which mimics the 
function of wild-type APC protein. 

29. A pair of single stranded DNA primers for determination 
of a nucleotide sequence of an APC gene by polymerase chain reaction, 
the sequence of said primers being derived from chromosome 5q band 
21, wherein the use of said primers in a polymerase chain reaction 
results in synthesis of DNA having all or part of the sequence shown in 
Figure 7. 

30. The primers of claim 29 which have restriction enzyme 
sites at each 5' end. 

31. The pair of primers of claim 29 having sequences corre- 
sponding to APC introns. 

32. A nucleic acid probe complementary to human wild-type 
APC gene coding sequences. 

33. The nucleic acid probe of claim 31 which hybridizes to an 
exon selected from the group consisting of: (1) nucleotides 822 to 930; 
and (2) nucleotides 931 to 1309; (3) nucleotides 1406 to 1545; (4) 
nucleotides 1956 to 2256. 

34. A kit for detecting alteration of wild-type APC genes 
comprising a battery of nucleic acid probes which in the aggregate 
hybridize to all nucleotides of the APC gene coding sequences. 

35. A method of detecting the presence of a neoplastic tissue 
in a human, comprising: 

detecting in a body sample isolated from a human alter- 
ation of a wild-type APC gene coding sequence or wild-type APC 
expression product, said alteration indicating the presence of a 
neoplastic tissue in the human. 

36. The method of claim 35 wherein said body sample is 
selected from the group consisting of serum, stool, urine and sputum. 

37. A method of detecting genetic predisposition to cancer, 
including familial adenomatous polyposis (FAP) and Gardner's Syndrome 
(GS), in a human comprising: 

detecting a germline alteration of wild-type APC gene 
coding sequences or their expression products in a human sample 
selected from the group consisting of blood and fetal tissue, said alter- 
ation indicating predisposition to cancer. 

38. The method of claim 37 wherein the expression products 
are mRNA molecules. 

39. The method of claim 38 wherein the alteration of 
wild-type APC mRNA is detected by hybridization of mRNA from said 
tissue to an APC gene probe. 

40. The method of claim 37 wherein alteration of wild-type 
APC gene coding sequences is detected by observing shifts in 
electrophoretic mobility of single-stranded DNA on non-denaturing 
polyacrylamide gels. 

41. The method of claim 37 wherein alteration of wild-type 
APC gene coding sequences is detected by hybridization of an APC 
gene coding sequence probe to genomic DNA isolated from said tissue. 

42. The method of claim 41 wherein the APC gene coding 
sequence probe detects a restriction fragment length polymorphism. 

43. The method of claim 37 wherein the alteration of 
wild-type APC gene coding sequences is detected by determining the 
sequence of all or part of an APC gene in said tissue using a polymerase 
chain reaction, deviations in the APC sequence determined from the 
sequence of Figure 7 suggesting predisposition to cancer. 

44. The method of claim 37 wherein the alteration of wild- 
type APC gene coding sequences is detected by identifying a mismatch 
between molecules (1) an APC gene or APC mRNA isolated from said 
tissue and (2) a nucleic acid probe complementary to the human wild- 
type APC gene coding sequence, when molecules (1) and (2) are hybrid- 
ized to each other to form a duplex. 

45. The method of claim 41 wherein the APC gene probe 
hybridizes to an exon selected from the group consisting of: 
(1) nucleotides 822 to 930; and (2) nucleotides 931 to 1309; (3) 
nucleotides 1406 to 1545 and (4) nucleotides 1956 to 2256. 

46. The method of claim 37 wherein the alteration of wild- 
type APC gene coding sequences is detected by amplification of APC 
gene sequences in said tissue and hybridization of the amplified APC 
sequences to nucleic acid probes which comprise APC gene coding 
sequences. 

47. The method of claim 37 wherein the alteration of 
wild-type APC gene coding sequences is detected by molecular cloning 
of the APC genes in said tissue and sequencing all or part of the cloned 
APC gene. 

48. The method of claim 37 wherein the detection of alter- 
ation of wild-type APC gene coding sequences comprises screening for 
a deletion mutation. 

49. The method of claim 37 wherein the detection of alter- 
ation of wild-type APC gene coding sequences comprises screening for 
a point mutation. 

50. The method of claim 37 wherein the detection of alter- 
ation of wild-type APC gene coding sequences comprises screening for 
an insertion mutation. 

51. The method of claim 37 wherein the expression products 
are protein molecules. 

52. The method of claim 51 wherein the alteration of 
wild-type APC protein is detected by immunoblotting. 

53. The method of claim 51 wherein the alteration of 
wild-type APC protein is detected by immunocytochemistry. 

54. The method of claim 51 wherein the alteration of 
wild-type APC protein is detected by assaying for binding interactions 
between APC protein isolated from said tissue and a second cellular 
protein. 

55. The method of claim 54 wherein the second cellular pro- 
tein is selected from the group consisting of MCC protein, wild-type 
APC protein and a G protein. 

56. A method of screening for genetic predisposition to can- 
cer, including familial adenomatous polyposis (FAP) and Gardner's Syn- 
drome (GS), in a human comprising: 

detecting among kindred persons the presence of a DNA 
polymorphism which is linked to a mutant APC allele in an individual 
having a genetic predisposition to cancer, said kindred being 
genetically related to the individual, the presence of said polymorphism 
suggesting a predisposition to cancer. 

57. A preparation of the human APC protein substantially 
free of other human proteins, the amino acid sequence of said protein 
corresponding to that shown in Figure 3 or 7 (SEQ ID NO: 1). 

58. A preparation of antibodies immunoreactive with a 
human APC protein and not substantially immunoreactive with other 
human proteins. 

59. A method of testing therapeutic agents for the ability to 
suppress a neoplastically transformed phenotype, comprising: 

applying a test substance to a cultured epithelial cell 
which carries a mutation in an APC allele; 

determining whether said test substance suppresses the 
neoplastically transformed phenotype of the cell. 

60. The method of claim 59 wherein the cultured epithelial 
cell has been genetically engineered to carry the mutation in the APC 
allele. 

61. A method of testing therapeutic agents for the ability to 
suppress neoplastic growth, comprising: 

administering a test substance to an animal which carries 
a mutant APC allele in its genome; 

determining whether said test substance prevents or sup- 
presses the growth of tumors. 

62. A transgenic animal which carries a mutant APC allele 
from a second animal species in its genome. 

63. An animal which has been genetically engineered to con- 
tain an insertion mutation which disrupts an APC allele in its genome. 

64. A cDNA molecule which encodes a protein having the 
amino acid sequence shown in Figure 3 or 7 (SEQ ID NO: 7 or 1). 

65. An isolated DNA molecule which encodes a protein having 
the amino acid sequence shown in Figure 3 or 7 (SEQ ID NO: 7 or 1). 

66. A yeast artificial chromosome which is known as 37HG4. 
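
Claims 10, 33 and 45 identify four exons by nucleotide range. As an illustration only, the following Python sketch maps a nucleotide position onto those ranges. The function names are hypothetical, the labels 1 to 4 simply follow the numbering used in the claims, and the sketch assumes the ranges are inclusive at both ends.

```python
# Nucleotide ranges copied from claims 10, 33 and 45.
# Assumption: both endpoints are inclusive.
EXON_RANGES = {
    1: (822, 930),
    2: (931, 1309),
    3: (1406, 1545),
    4: (1956, 2256),
}


def exon_for_position(pos):
    """Return the claim label (1-4) whose range contains pos, or None."""
    for label, (start, end) in EXON_RANGES.items():
        if start <= pos <= end:
            return label
    return None


def range_length(label):
    """Number of nucleotides in the labelled range, endpoints included."""
    start, end = EXON_RANGES[label]
    return end - start + 1


if __name__ == "__main__":
    for position in (822, 1309, 1400, 2256):
        print(position, exon_for_position(position))
```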
TABLE IIA 

Germline mutations of the APC gene in FAP and GS Patients 

PATIENT      CODON  NUCLEOTIDE CHANGE*  AMINO ACID CHANGE  AGE          EXTRA-COLONIC DISEASE 
[illegible]  219    TCA->TGA            Ser->Stop          [illegible]  Mandibular osteoma 
[illegible]  301    CGA->TGA            Arg->Stop          46 
34           301    CGA->TGA            Arg->Stop          27           Desmoid tumor 
21           413    CGC->TGC            Arg->Cys           24           Mandibular osteoma 
[illegible]  712    TCA->TGA            Ser->Stop          37           Mandibular osteoma 

3744         243    CAGAG->CAG          splice-junction 
[illegible]  301    CGA->TGA            Arg->Stop 
3127         456    CTTTCA->CTTCA       frameshift 
3712         500    T->G                Tyr->Stop 

* The mutated nucleotides are underlined. 
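
Table IIA pairs each codon change with its amino acid consequence, for example CGA->TGA with Arg->Stop. A minimal Python sketch of that classification under the standard genetic code is given here for illustration. The function name and the category labels are hypothetical, and the codon pairs in the usage lines are example inputs rather than a restatement of the table.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify(wild_type: str, mutant: str) -> str:
    """Classify a single-codon substitution as silent, nonsense or missense."""
    before = CODON_TABLE[wild_type.upper()]
    after = CODON_TABLE[mutant.upper()]
    if before == after:
        return "silent"
    if after == "*":          # '*' marks a stop codon
        return "nonsense"
    return "missense"


if __name__ == "__main__":
    print(classify("CGA", "TGA"))   # Arg -> Stop
    print(classify("CGC", "TGC"))   # Arg -> Cys
```

Frameshift and splice-junction changes are not single-codon substitutions and fall outside this sketch.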
TABLE IIB 

Somatic Mutations in Sporadic CRC Patients 

TUMOR        CODON            NUCLEOTIDE CHANGE  AMINO ACID CHANGE 
[illegible]  MCC 12           [illegible]        (Splice Donor) 
[illegible]  MCC [illegible]  [illegible]        (Splice Acceptor) 
[illegible]  MCC 267          CGG->CTG           Arg->Leu 
[illegible]  MCC 490          TCG->TTG           Ser->Leu 
T35          MCC [illegible]  CGG->CAG           Arg->Gln 
T51          MCC 691          GCT->GTT           Ala->Val 
[illegible]  APC 211          [illegible]        (Insertion) 
T27          APC [illegible]  CGA->TGA           Arg->Stop 
T133         APC [illegible]  [illegible]        (Splice Donor) 
T201         APC [illegible]  CAG->TAG           Gln->Stop 

For splice site mutations, the codon nearest to the mutation is listed. 

The underlined nucleotides were mutant; small case letters represent introns, large case letters represent exons. 
TABLE III 

Sequences of Primers Used for [illegible] Analysis 

[Table contents and footnote are illegible in the source.] 
TABLE IV 




Coomawj: f ' V E ' T P 4 CP S R ' $ S L J S L S 



1282: YCVEOTM CFSACSSISSLS 

1378: HYV06TPLMFSRCTSVSSLO 

U92: FATE$TPOGF$CSSSlSAL$ 

1S43: YCVEGTFt NFSTATSLSOIT 

1*4* T P I EGTPrCFSANOSLSSLO 

19i3: F A I ENTP VCPSHNSJ L SSLS 

20'-2: PMV£9TPVCP5RNSS15$IS 



Numbers indicate the first amino acid of each repeat. The consensus 
sequence at the top reflects a majority amino acid at a given position. 
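
As far as the caption of Table IV can be read, the consensus line reports the majority amino acid at each position of the aligned repeats. A minimal Python sketch of such a majority-rule consensus follows. The function name, the placeholder character "." and the strict more-than-half threshold are assumptions made for illustration, and the input strings are toy data rather than the repeats in the table.

```python
from collections import Counter


def consensus(seqs, min_fraction=0.5, placeholder="."):
    """Majority-rule consensus of equal-length aligned sequences.

    A residue is reported only if it occurs in more than min_fraction
    of the sequences at that position; otherwise the placeholder is used.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    result = []
    for column in zip(*seqs):
        residue, count = Counter(column).most_common(1)[0]
        result.append(residue if count / len(seqs) > min_fraction else placeholder)
    return "".join(result)


if __name__ == "__main__":
    toy_repeats = ["ACDE", "ACDF", "GCDE", "ACHK"]   # toy data, not from Table IV
    print(consensus(toy_repeats))
```

With a different threshold, or with ties broken differently, some positions would change between a residue and the placeholder, so the result should be read together with the rule used to produce it.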
[Figure residue: map graphic not reproduced. Labels recoverable from the source include Contig 1, Contig 2, Contig 3, YACs, Markers and TB1.] 
TB1 Amino Acid Sequence 



YAPVYV6SGR APRHPAPAAH HPRRPDGFDG L6YRGGARDE QGF6GAFPAR SFS76S0L6H 60 
WVTTPP0IP6 SRNLHWGEKS PPYGVPTTST PYEGPTEEPF SSGGGGSVQG QSSEOlNRFA 120 
GFGIGLASLF TENVLAHPCI VIRRQCQVNY HAQHYHLTPF TYINIHYSFN KTOGPRAUflC 180 
6MGSTFIVQG VTLGAEGIIS EFTPLPREVL HKWSPKQIGE HLLLKSITYV VAHPFYSASL 240 
IETVQSEIIR DNTGILECVK EGIGRVXGMG VPHSKRLIPL ISLIFPTVLH GVLHYIISSV 300 
IQKFVUXLK RICTYKSNLAE STSPVQSHLO AYFPELIANF AASLMDVIL YPLETVLHPL 360 
filfigRTXXO MTnifiYPVL» THTQYE6HBD CIMTTRQEE6 VFGFYKGFGA VIIQYTLHAA 420 
VIQITKIIYS TLLQ 434 

TB2 Amino Acid Sequence 

ELRRFORFLM EKNCMTOLLA KLEAJCTGVNR SFIAL6VIGL VALYLVFGYG ASLLCKLI6F 60 

GYPAYISXKA IESPNODOT QWLTYWYY6 VFSIAEFFSO IFL5WFPFYY IUCC6FLLWC 120 

KAPSPSHGAE ILYKRIIRPF FUHESQKDS WWLKDKAK ETADAITKEA KKATVNLLGE 180 

EJOCST 185 
MAAASYOQLL KQVEALKHEN SNIRQEIEON SNHITH.ETE ASNMKEVIKQ LQ6SIE0EAM 60 
ASSGQZOLLE RUCEINLOSS NFPGVKIRSK MSLRSYCSRE GSVSSRSGEC SPVPMGSFPR 120 
RGFVNGSRES TGYIEEIEKE RSILIAOLDK EEKEKOVYYA QLQNLTKRIO SUTENFSIQ 180 
TOMTRRQLEY EARQIRVAME EQLGTCQDME KRAQRRIARI QQIEKOILRI RQLIQSQATE 240 
AERSSQNKHE TGSHOAERQN EGQGVGEXNH ATSGNGQGST TRMOHETASV LSSSSTHSAP 300 
RRLTSHLGTK VEMVYSLLSM LGTHDKDOMS RTLLANSSSQ OSCISKRQSG CLPLLIQLLH 360 
6N0KDSVLLG NSRGSKEARA RASAALHNII HSQPOOKRGR REIRVLHLLE QIRAYCETCW 420 
EWEAHEPGM DQDICNPHPAP VEKQICPAVC VLMO.SFOEE HRHAHNEIGG LQAIAELLQV 480 
OCEMYGLTNO HYSITLRRYA GMALTNLTFG OVANKATLCS MKGCMRALVA QLKSESEOLQ 540 
QVXASVLRNL SURAOVNSIOC TLREVGSVKA LMECALEVKJC ESTUCSVLSA LUNLSAHCTE 600 
NKAOXCAVDG ALAFLVGTLT YRSQTNTLAI IESGGGZLRN VSSLIATNEO HRQILRENNC 660 
LQTILQHLW HSLTXVSNAC GTLUNLSARN PKDQEALUDM GAVSMLKXLI HSKHKMXAMG 720 
SAAALRNLMA NRPAKYKOAN INSPGSSLPS LHVRKGXAIE AELOAOHLSE TFONZOMLSP 780 
KASHRSKQRH KQSLYGOYVF OTNRHDONRS DNFNTGNKTV LSPYLNTTVL PSSSSSRGSL 840 
DSSRSEKDRS LERER6IGLG NYHPATENPG TSSttGlOXS TTAAOIAKVH EEVSAIHTSQ 900 
EORSSGSfTE LHCVTOERNA IRRSSAAHTH SNTYNFTKSE NSNRTCSKPY AKLEYKRSSN 960 
OSLNSVSSSO GYGKRGQHKP SXESYSEOOE SKFCSTGOYP AOLAHKIKSA NHHOONOGEL 1020 
OTPXNYSUCY SOEQLNSGRQ SPSQNERUAR PKHXIEOEIK QSEQRQSRNQ STTYPVYTES 1080 
TDOKHUCFQ? HFGQQECVSP YRSRGAMGSE TNRVGSNHGX NQNVSQSLCQ EOOYEOOKPT 1140 
NYSERYSEEE QHEEEERPTN YSHCYNEEW HVDQPXOYSL KYATOXPSSQ KQSFSFSWS 1200 
SGQSSCTEHH SSSSENTSTP SSNAKRQNQL HPSSAQSRSG QPQXAATCKV SSINQETIQT 1260 
YCVEOTPXCF SRCSSLSSLS SAEOEXGCNQ TTQOPOSANT lOXAEXKEKX GTRSAEOPVS 1320 
EVPAVSQHPR TKSSRLQG5S LSSESARHKA VEFSSGAKSP SKSGAQTPKS PPEHYVQETP 1380 
LMFSRCTSVS SLOSFESRSX ASSVQSEPCS GKVSGXXSPS DLPOSPGQTX PMRSKTPPP 1440 
PPQTAQTKRE VPKWCAPTAE KRESGPKQAA VNAAVQRVQV LPOADTLUtF ATESTPOGFS 1500 
CSSSLSALSL OEPFIQKDYE UHHPPYQEN DNGNETESEQ PKESNENQEJC EAEKTXDSEK 1560 
OLLDOSOOOO IEILEECXXS AMPTKSSRKA KXPAQTASKL PPPVARKPSG LPVYKLLPSQ 1620 
NRLOPQJCMVS FTPGOOHPRV YCVEGTPINF STATSLSDLT XESPPNELAA 6E6VR6GAQS 1680 
GEFEKRflTIP TEGRSTOEAQ GGKTSSVTXP ELDDNKAEEG DXLAECXNSA NPKGKSHKPF 1740 
RVKKIMOQVQ QASASSSAPN KNQLDGJQOQC PT5PVKPIPQ NTEYRTRVRK KADSKXNLNA 1800 
ERVFSDNKDS KQNLKNNSK OFNOKLPNNE ORVRGSFAFD SPHHYTPIEG TPYCFSRNOS 1860 
LSSLDFDOOO VDLSREJCAEL RXAJCEMCESE AKVTSHTELT SNQQSANKTQ AXAKQPINR6 1920 
QPKPILQXQS TFPQSSKDXP ORGAATDEKL QNFAXENTPV CFSHNSSLSS LSOIOQENNN 1980 
KENEPXKETE PPOSQGEPSK PQASGYAPKS FHVEDTPVCF SRNSSISSLS XOSEDOUQE 2040 
CISSAMPKKK KPSRU60NE KHSPRNNG6X LGEDITLDLK 0XQRP0SEK6 LSPOSENFOV 2100 
KAXQEGANSX VSSLHQAAAA ACLSRQA5S0 IdSILSLKSG ISLGSPFHLT PDQEEKPFTS 2160 
NKGPRILKPG EKSTLETKKX ESESKGXKGG IQCVYKSLXTG KVRSNSEXSG QNKQPLGANH 2220 
PSISRGRTMX HXPGVRNSSS STSPVSKGP PUCTPASKSP SEGQTATTSP RGAKPSVKSE 2280 
LSPVARQTSQ IGGSSKAPSR SGSROSTPSR PAQQPLSRPX QSPGRHSXSP 6RNGISPPNK 2340 
LSQLPRTSSP STASTKSS6S GKHSYTSP6R OMSQQNLTKQ TGLSKNASSX PRSESASKGL 2400 
NQHNNGNGAN KKVELSRNSS TKSSGSESDR SERPVLVRQS TFIKEAPSPT LRRKLEESAS 2460 
FESLSPSSRP ASPTRSOAQT PVLSPSLPOH SLSTHSSVQA GGWRKLPPNL SPTXEYNOGR 2520 
PAKRHDXARS HSESPSRLPX NRSGTWKREH SKHSSSLPRV STWRRT6SSS SIlSASSESS 2580 
EKAKSEOEKH VNSISGTXQS KENQVSAKGT URKXKENEFS PTNSTSQTVS SGATNGAESK 2640 
TLIYQKAPAV SKTEOVWVRI EDCPINNPRS GRSPTGNTPP VXDSVSEKAN PNXKDSKDNQ 2700 
AKQNVGNGSV PHRTVGLENR LNSFIQVOAP OQKGTEIKPG QNNPVPVSET NESSXVERTP 2760 
FSSSSSSKHS SPSGTVAARY TPFNYNPSPR KSSAOSTSAR PSQXPTPVNN NTKKRDSKTD 2820 
STESSGTQSP KRHSGSYLVT SV 2842 
APC 203 LGTCQDHE»WQRRIARIQQIEJa>ILRIRQl 233 

I : : II 111111:1 f I 
ral2 576 LTG/UCGIOLRALRRIARIEQGGTAISPTSPL 606 



B 

APC 453 KKLSFOEEHRJMMKELGGLQAXAELLQVO 481 

I : 11:1111: : : 
m3 mAChR 249 LYWRIYKETEBtTKELAGLQASGTEAETE 277 

II : I : llllll 
MCC 220 LYPNLAEERSRHEKEU6LREENESLTAM 248 

: II:: 11:11 II 
APC 453 NKLSFDEEHRKAMNE16GLQAXAELLGVD 481 